Abstract

In 1837, Dirichlet proved that there are infinitely many primes in any 
arithmetic progression in which the terms do not all share a common 
factor. We survey implicit and explicit uses of Dirichlet characters in 
presentations of Dirichlet's proof in the nineteenth and early twentieth
centuries, with an eye towards understanding some of the pragmatic pres- 
sures that shaped the evolution of modern mathematical method. We 
also discuss similar pressures evident in Frege's treatment of functions, 
and the nature of mathematical objects. 
1 Introduction

Historians commonly take the "modern" age of mathematics to have begun 
in the nineteenth century. But although there is consensus that the events of 
that century had a transformational effect on mathematical thought, it is not 
easy to sum up exactly what changed, and why. Aspects of the transformation 
include an increasingly abstract view of mathematical objects; the rise of al- 
gebraic methods; the unification of disparate branches of the subject; evolving 
standards of rigor in argumentation; a newfound boldness in dealing with the 
infinite; emphasis on "conceptual" understanding, and a concomitant deempha- 
sis of calculation; the use of (informal) set-theoretic language and methods; and 
concerns to identify a foundational basis to support the new developments.^1 It
is still an important historical and philosophical task to better understand these 
components, and the complex interactions between them. 

A good deal of attention has been given to early appearances of set-theoretic 
and structural language, including the use of equivalence relations and ideals in 
algebra; the expansion of the function concept in analysis, and its generalization 
to other mathematical domains; foundational constructions of number systems 
from the natural numbers to the reals and beyond; and the overall conception 
of mathematics as the study of structures and spaces, often characterized in
set-theoretic terms [41]. These provide relatively clear and focused manifestations
of the changes that took place. 

In this essay, we will consider certain functions, known as "Dirichlet charac- 
ters," and their role in proving a seminal 1837 theorem due to Dirichlet, which 
states that there are infinitely many prime numbers in any arithmetic progres- 
sion in which the terms do not all share a common factor. As far as abstract 
objects go, characters are fairly benign. For a given positive integer m, there are 

^1 The essays in Ferreirós and Gray [42] provide an overview.
only finitely many characters modulo m, and each one can be described exactly
by giving its value on the finitely many residue classes modulo m. Moreover, 
it is not terribly hard to provide a concrete and exhaustive description of the 
set of all such characters. Nonetheless, we will argue that the evolution of the 
treatment of characters over the course of the nineteenth century illustrates 
important themes in the overall transformation of mathematical thought. In- 
deed, we will try to call attention to changes in mathematical method that have 
by now become so ingrained that it is hard for us today to appreciate their 
significance. 
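
The claim that the characters modulo m admit a concrete and exhaustive description can be illustrated with a minimal sketch. It is offered only as an illustration under stated assumptions, not as the method used by Dirichlet or by any of the later authors discussed here. The function names `units`, `characters`, and `value` are hypothetical, and the encoding of a character by integer exponents k (standing for the root of unity exp(2πik/n), with n the number of units modulo m) is a convenience chosen for this sketch. The search is brute force and is sensible only for small m.

```python
from itertools import product
from math import gcd
import cmath


def units(m):
    """Residues modulo m (m >= 2) coprime to m, i.e. the group (Z/mZ)*."""
    return [a for a in range(1, m) if gcd(a, m) == 1]


def characters(m):
    """Brute-force list of all characters modulo m (small m only).

    Each character is a dict {unit: k}; the entry k stands for the value
    exp(2*pi*i*k/n), where n is the number of units modulo m.
    """
    us = units(m)
    n = len(us)
    others = [a for a in us if a != 1]
    found = []
    # Try every assignment of exponents to the units other than 1.
    for exps in product(range(n), repeat=len(others)):
        e = dict(zip(others, exps))
        e[1] = 0  # a character sends 1 to 1
        # Keep the assignment only if it is multiplicative:
        # chi(a * b) = chi(a) * chi(b), i.e. exponents add modulo n.
        if all(e[a * b % m] == (e[a] + e[b]) % n for a in us for b in us):
            found.append(e)
    return found


def value(chi, a, m):
    """Complex value chi(a); by convention 0 when a shares a factor with m."""
    a %= m
    if a not in chi:
        return 0
    n = len(chi)
    return cmath.exp(2j * cmath.pi * chi[a] / n)


if __name__ == "__main__":
    for m in (5, 8, 12):
        print(m, len(characters(m)))
```

For m = 5 the search returns four characters, one for each of the four units modulo 5, and each is fully specified by a table of four values. This is the sense in which the objects are finite and explicitly describable.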

This paper can be read in different ways. It can be viewed, narrowly, as 
a study of the history of Dirichlet's theorem, with an attempt to understand 
the pragmatic pressures that shaped that history. More grandiosely, it can be 
viewed as an attempt to come to terms with the nature of mathematical method 
and the objects of mathematical study. Along the way, we make some tentative 
attempts to integrate the narrow and grandiose views. 

The first such attempt begins in the next section where, to provide philosoph- 
ical context, we consider traditional metaphysical concerns in the philosophy of 
mathematics, and the ways our case study bears on these concerns. In Section 3
we provide historical context as well, by situating our study amidst a
host of topics related to the development of the modern function concept. In
Section 4 we discuss contemporary presentations of Dirichlet's proof, and in
Section 5 we highlight various senses in which these presentations treat
characters as objects. In contrast, Section 6 describes Dirichlet's own presentation
of his proof, in which the notion of a character does not figure at all. Section 7
then traces a gradual transition, as characters are transformed from shade-like
entities in the original proof to the fully embodied objects we take them to be
today. Section 8 analyzes the forces that shaped the transition. We return,
in Section 9, to metaphysical considerations, as we consider Frege's conflicted
attitudes towards the treatment of functions as objects. In Section 10 we argue
that there are important commonalities between Frege's foundational concerns
and mathematical concerns regarding the proper treatment of functions.
Finally, in Section 11 we bring the various strands of our narrative to a point of
closure.

2 From methodology to metaphysics 

The philosophy of mathematics has long been concerned with the nature of 
mathematical objects, and the proper methods for gathering mathematical 
knowledge. But as of late some philosophers of mathematics have begun to 
raise questions of a broader epistemological character: What does it mean to 
properly understand a piece of mathematics? In what sense can a proof be said 
to explain a mathematical fact? In what senses can one proof be viewed as 
better than another that establishes the same theorem? What makes a concept 
fruitful, and what makes one definition more natural or appropriate than an- 
other? Why are certain historical developments viewed as important advances? 
Questions like these are sometimes classified as pertaining to the methodology
of mathematics, in contrast to more traditional metaphysical concerns. In this 
section, we will argue that this distinction is unfortunate, in that a substantive 
metaphysics has to engage with methodological considerations. In particular, 
we will explain the sense in which we take it that the methodological study we 
undertake below bears on a metaphysical understanding. 

Let us start by distinguishing between two kinds of questions one can ask, 
having to do with the existence of mathematical objects. On the one hand, we 
can ask questions such as:

• Is there a nontrivial zero of the Riemann zeta function whose real part is 
not equal to 1/2? 

• Are there noncyclic simple groups of odd order? 

These are fundamentally mathematical questions. Answering them is not easy: 
the Riemann hypothesis posits a negative answer to the first, while the Feit- 
Thompson theorem, a landmark in finite group theory, provides a negative an- 
swer to the second. But even in the first case, where we do not know the answer 
to the question, we feel that we have a clear sense as to what kind of argument 
would settle the issue one way or another. Put simply, questions like these can 
be addressed using conventional mathematical methods. In contrast, there are 
questions like these: 

• Do the natural numbers (really) exist, and what sorts of things are they? 

• Are there infinite totalities? 

• What kind of sets and functions exist (if any), and what properties do 
they have? 

• Are there infinitesimals, fluxions, fluents, and ultimate ratios? 

These are questions as to the ultimate nature of mathematics and its objects of 
study, and seem to call for a more general, open-ended philosophical analysis. 

The distinction between the two types of questions may call to mind the log- 
ical positivists' distinction between questions that are "internal" to a linguistic 
framework, and "external" or "pragmatic" questions pertaining to the choice 
of a framework itself. Some take this distinction to have been repudiated,
decisively, by the criticisms of W. V. O. Quine [85]. But keep in mind that
Quine's arguments, which were directed against the claim that there is a sharp, 
principled distinction between the two sorts of questions, were not meant to 
show that there is no difference between them at all. In locating both kinds of 
questions on the common continuum of scientific inquiry, he did not deny that 
different kinds of questions call for different sorts of answers; indeed, his influen- 
tial Word and Object [SS] is an extended exploration of the considerations that 
he took to bear on "philosophical" questions of the latter sort. Nothing we say 
below commits us to a sharp distinction, and it seems relatively uncontroversial 
to say that insofar as any rational arguments can be brought to bear on the 
second group of questions, these will look different from the kinds of arguments
that are brought to bear on the first. 

So how shall we go about trying to answer them? Let us consider two 
approaches. 

The first is to turn to a foundational theory, like set theory, for answers. 
Fixing a definition of the natural numbers provides an answer to the first of the 
more philosophical questions, and, at the same time, a positive answer to the 
second. Sets and functions are then understood to have (at least) the properties 
that the axioms of set theory say they have, and Leibniz' infinitesimals and
Newton's fluxions, fluents, and ultimate ratios can be interpreted in various
ways, all perfectly well understood. In this way, foundational reduction serves 
to reduce philosophical questions to properly mathematical ones, by interpreting 
them in straightforward mathematical ways. 

Foundational reduction of this sort has been one of the most significant con- 
tributions of philosophical inquiry to the sciences, providing mathematics with 
a uniform language for constructing new objects, and clear means of investi- 
gating their properties. One might object to this approach to answering our 
"philosophical questions" on the grounds that there are multiple foundational 
theories on offer. After all, it would be disappointing if adjudicating ontological 
questions required obtaining consensus as to whether, for example, set-theoretic 
foundations or category-theoretic foundations should have the final say. More- 
over, even a particular choice of foundation leaves us with too many options; 
once again, it would be disappointing if our ontological deliberations required 
us to come to a consensus as to whether the natural numbers are "really" von 
Neumann ordinals or elements of some other isomorphic structure. But
foundational frameworks tend to be largely inter-interpretable,^2 and the structural
nature of mathematics tempers the problem of the overspecificity of a partic- 
ular choice of definitions. We can characterize the natural numbers uniquely 
up to isomorphism, and know full well that for most purposes it makes no dif- 
ference which structure we choose to represent them. With caveats like these, 
foundational reduction does quite a good job at what it was designed to do. 

We may nonetheless feel that our philosophical questions are trying to get 
at something that foundational reduction fails to address. That is, we may view 
them as asking whether, in some broad sense, we are justified in admitting the 
entities in question into our scientific discourse, and attributing those properties 
to them that our foundational theories are supposed to confirm. In that respect, 
set-theoretic reduction only pushes the problem back to asking whether we are 
justified in asserting the existence of sets with the properties that our axioms say
they have, and likewise for category-theoretic reduction and other alternatives. 
To avoid appealing to innate or supernatural acquaintance with the objects 

^2 To be sure, it can be hard to interpret a nonconstructive theory in a constructive one, and
there are some who reasonably hold that, to be meaningful, mathematics should be constructive.
So there are substantial foundational issues that are still subject to debate, and we do not mean
to suggest otherwise. Indeed, as will become clear below, we believe that the approach we 
describe in this essay does have bearing on issues relating to the tension between constructive 
and nonconstructive methods in mathematics. 
posited by our foundational theories, we then find ourselves pushed to reverse
course and justify those theories by the fact that they give us a uniform and 
useful account of mathematical objects that, antecedently, we have good reasons 
to be committed to. And those reasons seem to be exactly the sort of thing that 
our philosophical questions aim to uncover. 

Another shortcoming of foundational reduction is a lack of specificity. By 
aiming for flexibility and universality, foundational frameworks allow us to con- 
trive an endless stream of definitions, the vast majority of them totally useless. 
In other words, a foundational theory tells us what we are allowed to do, not 
what we ought to do. One may therefore interpret our philosophical questions as 
part of the broader question as to what sorts of mathematical objects ought to 
be central to our mathematical and scientific discourse. Axiomatic foundations 
per se are not designed to provide this kind of guidance. 

Since mathematizing the philosophical questions failed to provide the right 
sort of answer, we can move to the other extreme, and try to settle them on 
purely philosophical grounds. For example, in recent years antirealist denials 
of the existence of abstract mathematical objects have become popular, cast as 
forms of "nominalism" or "fictionalism."^3 What these positions tend to have
in common is that they are decidedly not concerned as to whether abstract 
mathematical objects should be employed by mathematicians, scientists, psy- 
chologists, or logicians in their daily discourse; questions like that are viewed as 
belonging to the realm of methodology, rather than metaphysics proper. Rather 
than presume to tell mathematicians what they ought to be doing, contemporary 
metaphysicians focus their efforts on the more fundamental task of supplying 
a philosophically sound interpretation of ordinary mathematical language and 
determining whether, from a broad philosophical perspective, mathematical ob- 
jects "really" exist. 

There are at least two reasons to be dissatisfied with such an understanding 
of metaphysics. The first is that, once ties to any pragmatic context have 
been severed, only vague intuitions and aesthetic judgments can be brought 
to bear on the fundamental questions. These can pull in opposite directions. 
For example, some take it as a fundamental tenet of naturalism that existence 
claims in the sciences should be taken at face value; since seven is a prime 
number, prime numbers, and hence numbers, exist. Other naturalists wield 
Occam's razor as their weapon of choice, and prefer, in the name of science, to 
discount scientists' casual existence claims. One man's meat is another man's 
poison; John Burgess dismisses the act of taking back one's scientific claims in 
one's more philosophical moments as doublethink, while denouncing Occam's 
razor as "medieval superstition" [10, p. 39]. But rhetorical flourishes like these
are unlikely to settle the issue as to whether a minimal ontology is worth a 
discordant science. Indeed, Maddy [75] has argued that realism and antirealism 
are, inherently, equally tenable, and Balaguer [7] has argued that, moreover, 
there is no fact of the matter as to which of them is true. This seems to be 
a serious drawback to the abstractly philosophical approach, namely, that it 

^3 See, for example, Leng [71].
reduces us to philosophical arguments as to whether there is really anything at
stake. This is not to deny that intuitions and aesthetic judgments have bearing 
on the acceptability of a metaphysical theory, but rather, to emphasize that 
intuitions and aesthetics, on their own, are not enough: there has to be some 
content to the theory as well. 

There is a further reason to ask more from metaphysics. We have rejected a 
wholesale internalization of the subject as failing to provide useful insight as to 
which mathematical objects are worthy of our attention, and why. The purely 
philosophical approach fares even worse in this respect: faced with the question 
as to which particular mathematical objects exist, the realist tells us simply 
that they all do, while the anti-realist tells us that none of them do. In the end, 
neither view has anything to say about which objects matter to us, and why. 

The state of affairs leaves one longing for the days when metaphysics mat- 
tered. From Aristotle to Descartes, and then to Frege, Russell, and early twenti- 
eth century philosophers of mathematics, the principal philosophical task was to 
determine the proper language and method of mathematics, including questions 
as to what objects we should admit into our ontology and how we should rea- 
son about them. In this tradition, contemporary metaphysics should provide 
us with better means of evaluating our methodological choices, and weighing 
the pros and cons of our decisions. Despite their different characterizations of 
the philosophical project, Carnap and Quine share the view that metaphysical 
questions come down to pragmatic questions as to the choice of a conceptual 
framework. Here is what Carnap has to say about our scientific commitments 
to abstract objects: 

The acceptance cannot be judged as being either true or false because 
it is not an assertion. It can only be judged as being more or less 
expedient, fruitful, conducive to the aim for which the language is 
intended. Judgments of this kind supply the motivation for the 
decision of accepting or rejecting the kind of entities.^4 [El p. 250]

Quine offers the following amendment: 

Consider the question whether to countenance classes as entities. 
This, as I have argued elsewhere, is the question whether to quantify 
with respect to variables which take classes as values. Now Carnap 
has maintained that this is a question not of matters of fact but 
of choosing a convenient language form, a convenient conceptual 
scheme or framework for science. With this I agree, but only on the 
proviso that the same be conceded regarding scientific hypotheses 
generally. [85, p. 43]

We take these views seriously here, seeing it as an important metaphysical 
task to clarify the role of our ontological posits with respect to ordinary

^4 Here and below, when the bibliographic entry of a work includes a reprinted version, page
numbers in the references refer to the reprinted version. Similarly, when the bibliographic
entry includes an English translation, our translations are taken from that source, unless we
indicate otherwise. Where no translation is listed, the translations are our own.
mathematical activity, and evaluate their efficacy towards achieving our mathematical
goals. This amounts to something like the naturalist approaches to the philosophy
of mathematics as advocated by Kitcher [BDj, Burgess [12], and Maddy [73],
focused on specific aspects of mathematical practice. The task is by no means 
easy, not least since it is not at all clear how to characterize our mathematical 
goals or evaluate our success at achieving them. But philosophy should not shy 
away from hard work. 

On this view, metaphysics becomes a kind of means-ends analysis. It is 
therefore important to keep in mind that scientific ends are usually best de- 
scribed relative to a scientific context. While a mathematician may have good 
reason to admit certain abstract objects with all their attendant properties, 
a logician may have good reason to adopt a more minimal ontology when it 
comes to studying the metamathematical effects of our axiomatic assumptions; 
and there may be good reasons for psychologists to avoid reference to abstract 
objects altogether in an account of how children learn to grasp mathematical 
concepts. It seems fruitless to grant any of these contexts the final say as to 
whether mathematical objects "really" exist; the objects in question play dis- 
tinct roles in distinct theories with distinct explanatory and predictive goals. 
In situations where the scientific contexts overlap, some translation and rein- 
terpretation is inevitable, and part of the metaphysician's task is to help keep 
these cross-disciplinary collaborations running smoothly. Our view does not 
rule out hope for an overarching frame that is broad enough to fit together the 
various scientific snapshots; but we do insist that, ultimately, the success of such 
a metaphysics needs to be judged in terms of its contribution to the pragmatic 
needs of the working scientist. 

In this essay, we will focus on the way that ontological commitments bear 
on distinctly mathematical concerns, supporting us in or hindering us from 
achieving our mathematical goals. How, then, should such an analysis proceed? 

It is instructive to consider those historical situations in which the mathe- 
matical community faced possibilities for methodological or ontological expan- 
sion and reacted accordingly. For example, it is helpful to consider the ancient 
Greek idealizations of number and magnitude, and the theory of proportion; 
the gradual acceptance of negative numbers, and then complex numbers, in the 
Western tradition; the use of algebraic methods in geometry, infinitesimals in 
the calculus, points at infinity in projective geometry; the development of the 
function concept from Euler to modern times; the gradual set-theoretic treat- 
ment of algebraic objects like cosets, ideals, equivalence classes in the nineteenth 
century; and so on. By studying the historical concerns regarding these expan- 
sions as well as the pressures that led to their ultimate acceptance, we can hope 
to better understand the metaphysical considerations that can be brought to 
bear. 

Indeed, at junctures like these, historical developments tend to follow a com- 
mon pattern. First, expansions are met with resistance, or at least, extreme 
caution. Sometimes, the expansions can be explained in terms of the more con- 
servative practice; for example, complex numbers can be interpreted as ordered 
pairs, algebraic solutions to geometric problems can be reinterpreted
geometrically, equations can be rewritten to avoid consideration of negative quantities.
In other cases, the expansions are not generally conservative, but, at least, can 
be explained away in particular instances; for example, arguments involving 
infinitesimals can sometimes be interpreted in terms of "ultimate ratios" in a 
geometric diagram, and operations on abstract objects can sometimes be un- 
derstood as operations on explicit representations. This makes it possible to 
adopt the expansions, tentatively, as convenient shorthand for more tedious but 
conservative arguments. Over time, the rules and norms that govern the ex- 
pansions are clarified, and the expansions themselves prove to be convenient, or 
even indispensable, while they do not cause serious problems. Eventually, the
mathematical community grows used to them, to the point where they become 
part of the usual business of mathematics. 

Whiggish narratives tend to dismiss such historical hand-wringing and
shilly-shallying as short-sighted conservatism on the road to progress. We, however,
prefer to view it as a rational response to the proposed expansions, whereby the 
benefits are carefully weighed against the concerns. In hindsight, we tend to 
make too little of the pitfalls associated with an ontological or methodological 
expansion. To start with, there are concerns about the consistency and coher- 
ence of the new methods, that is, worries as to whether the changes will lead to 
mistakes, false results, or utter nonsense, perhaps when employed in situations 
that have not even been imagined. Kenneth Manders has also emphasized the 
importance of maintaining control of our mathematical practices [75]. Math- 
ematics requires us to be able to come to agreement as to whether a proof is 
correct, or whether a given inference is valid or not. If new objects come with 
rules of use that are not fully specified, or vague, or unclear, the practice is 
in danger of breaking down. In a sense, this concern comes prior to concerns 
of consistency: if it is not clear what properties abstract magnitudes, negative 
numbers, complex numbers, infinitesimals, sets, and "arbitrary" functions have, 
it doesn't even make sense to ask whether using them correctly will lead to 
contradictions.^5

And then there are further concerns as to whether the new methods are 
meaningful and appropriate to mathematics. Even if a body of methods is 
consistent and clearly specified, it may still fail to provide us with the results we 
are after. If you expect an existence proof to yield certain kinds of information 
about the object that is asserted to exist, methods that fail to provide that 
sort of information do not constitute mathematics — or, at least, not the kind of 
mathematics we should be doing. If you expect a mathematical theory to make 
scientific predictions that we can act on rationally, it is a serious concern as to 
whether the new methods can deliver. 

In short, the concerns are not easily set aside. What, then, are the factors 
that might sway a decision in favor of an expansion? Mathematicians tend to 
wax poetic in their praise of conceptual advances, highlighting the power of new 
methods, the elegance and naturality of the resulting theory, and the insight and 

^5 As Wilson [101] and [93] emphasize, however, mathematics often gets by surprisingly well
with concepts that are problematic, incompletely specified, and not fully understood.
depth of the associated ideas. Part of our goal here is to de-romanticize these
virtues and gain clarity as to what might be achieved. In many instances, the 
virtues in question seem to have a lot to do with efficiency and economy of 
thought;^6 we tend to value methods that make it possible to solve problems
that were previously unsolvable, or simplify proofs and calculations that were 
previously tedious, complex, and error-prone. Below we will consider specific ways
in which ontological and methodological expansions help us manage complex 
tasks by suppressing irrelevant detail, making key features of a problem salient, 
and keeping key information ready-to-hand. We will also try to understand the 
way they make it possible to generalize and extend results, and facilitate the 
transfer of ideas to other domains. 

To summarize our high-level historical model: when mathematics is faced 
with methodological expansion, benefits such as simplicity, generality, and effi- 
ciency are invariably weighed against concerns as to the consistency, cogency, 
and appropriateness of the new methods. Sufficient benefit encourages us to 
entertain the changes cautiously, while trying to minimize the dangers involved. 
Cogency is obtained by working out the norms and conventions that govern the 
new methods. Consistency may not be guaranteed, but our experiences over 
time can bolster our faith that the new methods do not cause problems. In this 
regard, initial checks that the new methods are partially conservative over the 
old ones help preserve mathematical meaning, and reassure us that even if
the new methods turn out to be problematic, one will be able to restrict their 
scope in such a way that preserves their utility.^7 A metaphysics of mathematics
should give us better means to evaluate such expansions: to talk about co- 
gency of a mathematical argument and whether it delivers the desired content; 
and to understand the ways in which our ontological posits and methodological 
expansions improve our ability to reason effectively. 

Philip Kitcher invoked a similar model in his important book, The Nature of
Mathematical Knowledge [59]. Chapter 9, "Patterns of mathematical change,"
discusses the historical tendency to weigh "problem-solving dividends" against 
the perceived costs of an extension, which stand "on the other side of the ledger" 
[59, p. 199]. Kitcher's case studies, which include the introduction of Cartesian
methods to geometry and nineteenth century developments in analysis, highlight 
considerations very similar to the ones we discuss here. One small difference is 
that Kitcher focused on foundational developments in which the methods under 
consideration make it possible to solve problems that were previously unsolvable 
while, at the same time, posing clear problems, such as false conclusions and 
uncomfortable gaps in reasoning. Below, we focus on evolving presentations of 
a proof whose correctness was never in question, showing that such cost-benefit 
considerations bear upon everyday mathematical developments as well. A more 
striking difference is that Kitcher invokes his historical model in support of an 
empiricist epistemology of mathematics that runs counter to what he calls the 

^6 The phrase is borrowed from Ernst Mach's The Science of Mechanics [73]; we are grateful
to Michael Detlefsen for bringing this to our attention.

^7 Wittgenstein's discussion of contradiction is interesting in this regard; see [105, Lectures
XI-XII].
traditional "apriorist program." His concerns trace back to the Carnap-Quine
debate as to whether one can, and should, maintain some sort of principled 
distinction between analytic and synthetic knowledge. These issues are largely 
orthogonal to the concerns addressed here: although our case study can support 
a better understanding of the kinds of rational considerations that bear upon 
mathematical developments, it is not wedded to any particular stand on the 
analytic / synthetic distinction. 

With this long-winded discussion behind us, we can now state our goals 
more clearly. In this essay, we will explore the evolution of the mathematical 
treatment of certain types of functions in nineteenth century mathematics, from 
an appropriately metaphysical point of view. 

3 Functions in the nineteenth century 

3.1 The generalization of the function concept 

Nineteenth century mathematicians dealt with a number of objects that we, 
today, view as instances of the function concept. Analysis dealt with functions 
defined on the continuum, that is, functions from the real numbers, R, to R.
Cauchy and others extended the subject to include functions from the complex 
numbers, C, to C. Analysts also commonly dealt with sequences and series, 
which can be viewed as functions from the natural numbers, N, to a target 
domain, such as the integers or the reals. Number theory dealt with functions 
like the Euler totient function, φ, which maps N to N. Geometry dealt with
transformations of the Euclidean or projective plane, which can be viewed as 
invertible maps from the space to itself. Galois' theory of equations focused 
on substitutions, or permutations, of a finite set A, which is to say, bijective 
functions from A to A. Below we will see how developments in number theory 
led to the study of characters, that is, functions from the integers, Z, to C, 
or from the group (Z/mZ)* of units modulo m to C. By the end of the cen- 
tury, Dedekind and Cantor considered arbitrary mappings, or correspondences, 
between domains. 

Despite this diversity, most of the literature on the evolution of the function 
concept (including [TOIIS^ [72117911107] ) has focused on functions on the real or 
complex numbers. Surveys typically trace the evolution of the concept from 
the introduction of the term "function" (functio) by Euler in 1748, to the 
mention of "arbitrary functions" (fonction arbitraire) in the title of Dirichlet's 
seminal paper of 1829 [25], to the dramatically opposed treatment of the func- 
tion concept by Riemann and Weierstrass.* There is a very good reason for 
this: in the early nineteenth century, the word "function" was used almost ex- 
clusively in this narrow sense. Indeed, there is little in the early nineteenth 
century literature that suggests any family resemblances between the kinds of 

*See Bottazzini and Gray [52] for a history of complex function theory, which includes a 
detailed treatment of the work of Riemann and Weierstrass, in particular. 
mathematical objects enumerated in the last paragraph, let alone subsumption 
under a single overarching concept. 

For example, sequences of real numbers were referred to as sequences or 
series, or often just introduced by displaying the initial elements $a_0, a_1, a_2, \ldots$ 
with an ellipsis at the end. The word "series" (série in French, Reihe in 
German) was also used to describe a finite or infinite sum. For example, Dirichlet 
wrote the following in 1835: 

Now let 

\[ F(\alpha) = b_0 + b_1 \cos \alpha + b_2 \cos 2\alpha + \ldots = \sum b_i \cos i\alpha \]

be an arbitrary finite or infinite series, whose coefficients are inde- 
pendent of $\alpha$. [H p. 249] 

The dependence of the coefficient $b_i$ on $i$ was indicated by the subscript, just as 
we do today. 

Terminology governing number-theoretic functions was more varied. In the 
eighteenth century, Euler tended to refer to number-theoretic functions as "sym- 
bols" or "characters." For example, when introducing what we now call the 
totient function in 1781, he wrote: 

... let the character $\pi D$ denote that multitude of numbers which are 
less than $D$ and which have no common divisor with it. [37, p. 2] 

In 1801, in §38 of the Disquisitiones Arithmeticae [50], Gauss introduced the 
totient function as a notation: 

For brevity we will designate the number of positive numbers which 
are relatively prime to the given number and smaller than it by the 
prefix $\varphi$. We seek therefore $\varphi A$. 

In the following section, he referred to the "character" $\varphi$. Later, in §52, he 
defined another number-theoretic function, $\psi$, and in §53 compared it to the $\varphi$ 
"symbol" (signo). In the Disquisitiones, the word "function" is never used to 
describe such entities. 

Early in the nineteenth century, in projective geometry, Möbius studied gen- 
eral "relations" between figures in the plane, such as "affinities" and "collineations." 
In his Erlangen Program of 1872, Klein considered invertible "transformations" 
(Transformationen) of a space, and the group they form under the operation 

3 "Es sei jetzt: 

\[ F(\alpha) = b_0 + b_1 \cos \alpha + b_2 \cos 2\alpha + \ldots = \sum b_i \cos i\alpha \]

eine beliebige endliche oder unendliche Reihe, deren Coefficienten von $\alpha$ unabhängig sind." 

"Quod quo facilius praestari possit, denotet character $\pi D$ multitudinem istam numerorum 
ipso $D$ minorum, et qui cum eo nullum habeant divisorem communem." [37, p. 19] 

"Designemus brevitatis gratia multitudinem numerorum positivorum ad numerum datum 
primorum ipsoque minorum per praefixum characterem $\varphi$. Quaeritur itaque $\varphi A$." 

See the discussion in Wussing [106, pp. 35-40]. 
of "composition" (Zusammensetzung) [61]. Once again, in neither case is there 
any hint that such transformations bear a connection to a more general function 
concept. 

In algebra, Galois' groups consisted of "substitutions" (substitutions) of the 
roots. Cayley's famous 1854 paper, which provided the first axiomatization of 
the group concept, begins as follows: 

Let $\theta$ be a symbol or operation, which may, if we please, have for its 
operand, not a single quantity $x$, but a system $(x, y, \ldots)$, so that 

\[ \theta(x, y, \ldots) = (x', y', \ldots), \]

where $x', y', \ldots$ are any functions whatever of $x, y, \ldots$, it is not even 
necessary that $x', y', \ldots$ should be the same in number with $x, y, \ldots$. 
In particular, $x'$, $y'$, &c. may represent a permutation of $x$, $y$, &c., $\theta$ 
is in this case what is termed a substitution; and if, instead of a set 
$x, y, \ldots$, the operand is a single quantity $x$, so that $\theta x = x' = fx$, $\theta$ 
is an ordinary functional symbol. [111 P- 123] 

Notice that although we can assume that each quantity $x', y', \ldots$ is a func- 
tion of $x, y, \ldots$, for Cayley, $\theta$ is, in general, an operation and not a function. 
As examples of cases where $\theta$ is a function, he singled out multiplication by 
quaternions and examples arising from the study of elliptic functions. But he 
makes it clear that it is the operations, more generally, that are assumed to 
satisfy the associative law, under composition: 

A symbol $\theta\varphi$ denotes the compound operation, the performance of 
which is equivalent to the performance, first of the operation $\varphi$, and 
then of the operation $\theta$; $\theta\varphi$ is of course in general different from $\varphi\theta$. 
But the symbols $\theta, \varphi, \ldots$ are in general such that $\theta.\varphi\chi = \theta\varphi.\chi$, &c., 
so that $\theta\varphi\chi$, $\theta\varphi\chi\omega$, &c. have a definite signification independent of 
the particular mode of compounding the symbols. 

Dedekind's later treatment of Galois theory [23] focused on the group of 
automorphisms of a field rather than permutations of roots, but even in this 
context he used the term "substitution" for an invertible map from a field to 
itself, and "permutation" for a substitution that is moreover a homomorphism 
with respect to the field structure. Once again, there seems to be nothing to 
link these substitutions and permutations with the function concept from anal- 
ysis. Even as late as 1895, when Cantor presented his mature theory of the 
infinite [T3], he described a one-to-one correspondence as a "law of association" 
(Zuordnungsgesetz) between sets, with nothing to suggest that these correspon- 
dences had anything to do with the objects of function theory with which he 
had begun his mathematical career. 

The shift from using the term "function" exclusively for functions defined 
on the continuum to those defined on more general domains was gradual. We 

This excerpt is quoted and discussed in Pengelley [84]. 
have found a very early instance of the phrase "number-theoretic function" 
(zahlentheoretische Funktion) in an 1850 paper by Eisenstein, which begins with 
a notably self-conscious justification for the use of the term: 

Since, with the concept of a function, one moved away from the 
necessity of having an analytic construction, and began to take its 
essence to be a tabular collection of values associated to the values 
of one or several variables, it became possible to take the concept 
to include functions which, due to conditions of an arithmetic na- 
ture, have a determinate sense only when the variables occurring 
in them have integral values, or only for certain value-combinations 
arising from the natural number series. For intermediate values, 
such functions remain indeterminate and arbitrary, or without any 
meaning. [35, p. 706] 

We might have expected Eisenstein to observe, more simply, that once one starts 
to think of a function as a correspondence between input and output values, it 
is reasonable to transfer the notion to correspondences between domains other 
than the real or complex numbers, described by means other than an analytic 
expression. Instead, he takes the surprising tack of viewing number-theoretic 
functions as partial functions on the real numbers. This is a nice illustration of 
the fact that concepts are often stretched and transformed in surprising ways, a 
theme that is central to Mark Wilson's book, Wandering Significance 
[102]. In any case, the passage makes it clear that the modern notion of a 
function defined on arbitrary domains was far from the mid-nineteenth century 
mindset. 

We have already noted that in 1854, Cayley considered multiplication by 
a quaternion as a function. This seems to fit with the view of the natural 
progression from real numbers to complex numbers to quaternions, as gener- 
alizations of the concept of magnitude. One finds another hint of expansion 
in the title of Dedekind's 1854 Habilitation lecture, "On the introduction of 
new functions in mathematics" ("Über die Einführung neuer Funktionen in der 
Mathematik" ) [5^ . In this lecture, Dedekind discussed the way that the domain 
of natural numbers was gradually expanded to include the integers, real num- 
bers, and complex numbers, while extending and preserving the properties of 
familiar operations like addition, and division. But the evocative use of the word 
"function" in the title is tempered by the contents of the lecture itself, where 
the word "operation" is used exclusively when it comes to functions defined on 
the natural numbers and integers. 

"Seit man bei dem Begriffe der Funktion von der Nothwendigkeit der analytischen Zusammensetzung 
abgehend, das Wesen derselben in die tabellarische Zusammenstellung einer Reihe 
von zugehörigen Werthen mit den Werthen des oder der (mehrerer) Variabeln zu setzen anfing, 
war es möglich, auch solche Funktionen unter diesen Begriff mit aufzunehmen, welche aus 
Bedingungen arithmetischer Natur entspringend nur für ganze Werthe oder nur für gewisse 
aus der natürlichen Zahlenreihe hervorgehende Werthe und Werth-Combinationen der in ihnen 
vorkommenden Variabeln einen bestimmten Sinn erhalten, während sie für die Zwischenwerthe 
entweder unbestimmt und willkürlich oder ohne alle Bedeutung bleiben." We are grateful to 
Wilfried Sieg for help with the translation. 
To our knowledge, it is not until 1879 that one finds the word "function" 
used with any approximation to the modern sense; and then, in that year, it 
occurred in two remarkable sources. The first is the work of Dedekind: in 
1879, in a supplement dealing with quadratic forms in the third edition of his 
presentation of Dirichlet's lectures on number theory he defined the general 
notion of a character on a class group (that is, a function which maps equivalence 
classes of ideals in an algebraic number field to the complex numbers) : 

. . . the function $\chi(\mathfrak{a})$ also possesses the property that it takes the 
same value on all ideals $\mathfrak{a}$ belonging to the same class $A$; this value is 
therefore appropriately denoted by $\chi(A)$ and is clearly always an $h$th 
root of unity. Such functions $\chi$, which in an extended sense can be 
termed characters, always exist; and indeed it follows easily from the 
theorems mentioned at the conclusion of §149 that the class number 
$h$ is also the number of all distinct characters $\chi_1, \chi_2, \ldots, \chi_h$ and that 
every class $A$ is completely characterized, i.e. is distinguished from 
all other classes, by the $h$ values $\chi_1(A), \chi_2(A), \ldots, \chi_h(A)$. 

By 1882, Weber [100] described characters on arbitrary groups as "functions." 
In his foundational essay Was sind und was sollen die Zahlen? [22] of 1888, 
Dedekind considered arbitrary mappings (Abbildung) between domains, but, 
curiously, seems to distinguish mappings from functions. 

The other landmark source of 1879 is Frege's Begriffsschrift [43]. Here Frege 
defined the notion of a function as follows: 

If, in an expression (whose content need not be a judgeable content), 
a simple or complex symbol occurs in one or more places, and we 
think of it as replaceable at all or some of its occurrences by an- 
other symbol (but everywhere by the same symbol), then we call 
the part of the expression that on this occasion appears invariant 
the function, and the replaceable part its argument. [43, §9] 

". . . die Function $\chi(\mathfrak{a})$ ausser der Eigenschaft . . . noch die andere besitzt, für alle derselben 
Classe $A$ angehörenden Ideale $\mathfrak{a}$ denselben Werth anzunehmen, welcher mithin zweckmässig 
durch $\chi(A)$ bezeichnet wird und offenbar immer eine $h$te Wurzel der Einheit ist. Solche 
Functionen $\chi$, die man im erweiterten Sinn Charaktere nennen kann, existiren immer, und 
zwar geht aus den am Schlusse des §149 erwähnten Sätzen leicht hervor, dass die Classenanzahl 
$h$ zugleich die Anzahl aller verschiedenen Charaktere $\chi_1, \chi_2, \ldots, \chi_h$ ist, und dass jede Classe 
$A$ durch die ihr entsprechenden $h$ Werthe $\chi_1(A), \chi_2(A), \ldots, \chi_h(A)$ vollständig charakterisirt, 
d.h. von allen anderen Classen unterschieden wird." The quotation appears in §178 in the 
1879 edition of the Vorlesungen [29], and in §184 of the 1894 edition, which is reproduced in 
Dedekind's Werke [24]. The translation above is by Hawkins [56, p. 149]. 

We discuss Weber's 1882 paper, and provide more background on the history of characters, 
in Section 7.1 below. 

See §135 in [22]. The relationship between the two notions will be discussed in forthcoming 
work by Wilfried Sieg and the second author. 

18 "Wenn in einem Ausdrucke, dessen Inhalt nicht beurtheilbar zu sein braucht, ein einfaches 
oder zusammengesetztes Zeichen an einer oder an mehren Stellen vorkommt, und wir denken 
es an allen oder einigen dieser Stellen durch Anderes, überall aber durch Dasselbe ersetzbar, 
so nennen wir den hierbei unveränderlich erscheinenden Theil des Ausdruckes Function, den 
ersetzbaren ihr Argument." 
While this definition is couched in terms of expressions, Frege later formulated 
an account that does not depend on linguistic apparatus. We will consider this 
account in more detail in Sections and [TUl but make a few brief remarks here. 
In 1891, Frege noted that his conception of function extended previous ones in 
two ways: by enlarging the collections of signs that could be used to construct 
a functional expression, and by enlarging the domain of possible arguments for 
functions. Regarding the first extension, Frege allowed signs such as the 
equality symbol to occur in functional expressions, thus allowing "x^ + x'^ + 
X + 1 = 0" to be classified as a functional expression. Regarding the second 
extension, Frege wrote: 

Not merely numbers, but objects in general, are now admissible; and 
here persons must assuredly be counted as objects. [IHl P- 17] 

The syntax of Frege's logical language allows, moreover, for higher-order func- 
tionals, which is to say, functions which take elements of a domain of functions 
as arguments. Although Frege was influenced by Riemann and developments 
in the theory of complex functions [95], we will see in Section [S] that some 
features of his treatment of the function concept distinguish it from the con- 
temporary mathematical notion. Nonetheless, in the context of the history we 
have sketched here, Frege's dramatic expansion was novel and bold. 

Today, we are apt to look back at the history and wonder what mathemati- 
cians before Dedekind and Frege were missing, that is, why they didn't realize 
the extent to which they could simplify terminology and notation and eliminate 
conceptual clutter. After all, all they had to do was write $f : A \to B$ to denote 
a function $f$ between two arbitrary domains, $A$ and $B$, and recognize sequences, 
number-theoretic functions, permutations, and transformations as instances of 
such. But this seems to us to be the wrong question to ask. A better one is this: 
why would anyone want to view these patently different entities as instances of 
a single concept? In practice, they are described in very different ways: ana- 
lytic functions were typically given by piecewise analytic expressions; number 
theoretic functions were given by implicit algorithms, such as counting the num- 
bers less than and relatively prime to some number; Galois' substitutions were 
given by explicit lists; early geometric transformations were given by geometric 
constructions. Moreover, the different kinds of objects support very different 
operations: functions in analysis can sometimes be composed, but sequences 
and series cannot; substitutions (and Cayley's "operations") can always be in- 
verted, whereas number-theoretic functions generally cannot; one can sum an 
expression over all the permutations of a finite set, while one certainly cannot 
sum an expression over all the functions from R to R; notions of continuity and 
differentiability make no sense outside the realm of analysis; and so on. 

As noted in Section [51 there are startup costs involved in fashioning coherent 
means of dealing with new abstract objects, and significant concerns. Thus one 

i^See 8, pp. 137 and 140]. 

"Es sind nicht mehr bloß Zahlen zuzulassen, sondern Gegenstände überhaupt, wobei ich 
allerdings auch Personen zu den Gegenständen rechnen muß." 
should not expect to see an investment in the unification of the function concept 
until there was either a pressing need or clear benefits to be had. Our study 
below will suggest that some of these benefits stem from the uniform treatment 
of algebraic structures composed of functions. For example, the set of auto- 
morphisms of a structure forms a group under composition; but there is little 
reason to recognize permutations, geometric transformations, or automorphisms 
of a field as such until one has particularly useful things to say about groups of 
automorphisms in general. Similarly, one can view a number of common math- 
ematical constructions as quotienting by the kernel of a suitable function, but 
there is no point to making the effort to recognize them as such until one real- 
izes that substantial things can be said about the commonality. Alternatively, 
it is also possible that the unification of the function concept is best viewed 
as a by-product of the drive towards subsuming mathematical objects under a 
single unifying foundation, a move which, in turn, might have been encouraged 
by other mathematical needs. 

It is beyond the scope of our study to speculate more than this as to the 
factors that encouraged the development of a general concept of function and 
led to its gradual acceptance. But noting the absence of an overarching func- 
tion concept in the early nineteenth century serves to clarify the nature of the 
project here. When we consider the history of number-theoretic characters in 
proofs of Dirichlet's theorem, we are studying the evolving mathematical treat- 
ment of what we now take to be certain kinds of functions, in the hopes that 
understanding the forces that guided that evolution will help illuminate the 
forces that shaped the evolution of modern mathematics more generally. In 
doing so, we will focus on the way characters were used and the properties that 
were ascribed to them, while, for the most part, steering clear of the question 
as to how and why they came to be viewed as instances of the general function 
concept. 

3.2 Other aspects of the evolution of the function concept 

We have seen that it was not until the latter half of the nineteenth century that 
there was any hint of a general notion of a function between any two domains, 
and that it was a while before what we currently take to be instances of the 
function concept were subsumed under such a general notion. Let us label this 
trend the generalization (or, perhaps, unification) of the function concept. The 
point of this section is to note that this is not the only sense in which the 
function concept evolved over the course of the century. 

To start with, there was also the extensionalization of the function concept. 
Recall that, in Euler's time, a function from the real numbers to the real num- 
bers was expected to have a representation in terms of a certain kind of analytic 
expression. Arguments involving functions could then presuppose such a repre- 
sentation. Precisely this feature of Cauchy's treatment of Fourier series was the 
target of Dirichlet's criticism in the opening paragraphs of his 1829 paper [25] 
on that same subject. Cauchy, like Dirichlet, was concerned with the question 
as to when the Fourier series of a real-valued function converges to the value of 
the original function at a given point. But Cauchy's analysis presupposed that 
the function in question is represented by a power series, which could then be 
applied to complex arguments. The use of the phrase "arbitrary function" in 
Dirichlet's title signaled his intention to lift this restriction. Even though the 
paper dealt primarily with continuous functions, the hypothesis of continuity 
was expressed purely extensionally, which is to say, in terms of the values that a 
function takes at its arguments. In other words, Dirichlet took great pains not to 
treat analytic operations like differentiation as symbolic operations, but, rather, 
as operations on the functions themselves, viewed extensionally. As Dirichlet 
famously emphasized, this renders the concept of function open-ended, in that 
all that is needed is some determinate relationship between input and output 
values. In particular, this made it possible for him to consider, as a difficult 
example, functions which take one value on the rationals, and another value on 
the irrational numbers. 

This brings us to another aspect of the evolution of the notion of a function, 
which we will call the liberalization of the function concept. Setting aside issues 
of unification and generality, there is also the issue of the language and methods 
one helps oneself to in defining functions in a particular domain. As Frege 
later noted, the very act of accepting a definition by cases on whether an 
argument is rational or not was a striking move on Dirichlet's part. This paves 
the way to the use of more dramatically non-constructive methods in describing 
functions from the reals to the reals, say, in terms of limits and other infinitary 
operations. Riemann's contributions to function theory were especially novel 
in this regard. Some, like Dedekind, hailed the fact that Riemann's methods 
made it possible to consider functions without having a particular representation 
to work with, or method of computation. Others, like Weierstrass, held the 
Riemannian approach to be defective for just this reason. 

Finally, there is the issue of the reification of the function concept, which 
is to say, the decision to treat functions as objects, on a par with mathematical 
objects like the natural numbers. In Section [SJ we will have a lot more to say 
about what this amounts to. 

We have already indicated that we will have little more to say about the 
generalization of the function concept. And, since characters are fairly simple 
combinatorial objects, we can avoid issues having to do with liberal uses of 
the infinite. We will be quite concerned, however, with issues regarding the 
reification of characters, and their treatment as extensional objects. Thus our 
case study considers some important aspects of the evolution of the function 
concept, but ignores others, as well as the broader question as to how all the 
various components are related to one another. 

^^See the discussion in Section [S] 

See Bottazzini and Gray [52]. For an interesting exploration of the ways that nineteenth 
century analysis expanded to incorporate a more liberal understanding of the function concept, 
see Chorlay [16]. 
4 Dirichlet's theorem 

Two integers, m and k, are said to be relatively prime, or coprime, if they have 
no common factor greater than 1. In 1837, Dirichlet proved the following: 

Theorem 4.1. If m and k are relatively prime, the arithmetic progression 
m,m + k,m + 2k, . . . contains infinitely many primes. 

In other words, if m and k are relatively prime, there are infinitely many 
primes congruent to m modulo k. Dirichlet pointed out that Legendre assumed 
this fact, without proof, in 1788 [70], when proving the law of quadratic reci- 
procity. In his Disquisitiones Arithmeticae, Gauss noted this gap in Legendre's 
work after presenting his own proof of the law of quadratic reciprocity, one 
which does not rely on Theorem 4.1. But Gauss never provided a proof 
of that theorem either. Dirichlet's own proof is striking, not only due to the fact that 
it finally established Legendre's conjecture, but also due to its sophisticated use 
of the methods of analysis in establishing a purely number-theoretic assertion. 
Dirichlet noted [27, pp. 309-310] that his method was inspired by a proof due 
to Euler [36, Chapter XV] that there are infinitely many primes, though Dirich- 
let's ideas go considerably beyond Euler's own. In this section, we will describe 
Euler's proof, and then define the modern notion of a group-theoretic character, 
which supports the generalization to arbitrary characters. We will then describe 
contemporary proofs of Dirichlet's theorem, using modern terminology and no- 
tation, as presented in textbooks such as Everest and Ward [38, pp. 207-224]. In 
Section [51 we will use this presentation to frame our discussion of the historical 
development, which occurs in Sections [51 and [7| 

There is a sense, however, in which our modern presentation is old-fashioned. 
The use of m and k for the initial value and common difference in the statement 
of Theorem 4.1 is due to Dirichlet, and was picked up by a number of successive 
authors, including Dedekind and Hadamard. Modern presentations are more apt 
to use a and d, but although the use of m and k may feel alien to readers familiar 
with contemporary textbook proofs, it will facilitate the historical comparisons 
later on. (Our reason for using the variable q to range over the prime numbers 
will similarly become clear when we discuss Dirichlet's original proof.) 

4.1 Euler's proof that there are infinitely many primes 

In the Elements, Euclid proved that there are infinitely many primes, but his 
proof does not provide much information about how they are distributed. Euler, 
in his Introductio in analysin infinitorum [36], proved the following: 

Theorem 4.2. The series $\sum_q \frac{1}{q}$ diverges, where the sum is over all primes $q$. 

This implies that there are infinitely many primes, but also says something 
more about their density. For example, since we know that the series $\sum_n \frac{1}{n^2}$ is 
convergent, it tells us that, in a sense, there are "more" primes than there are 
squares. 

See Gauss' remarks in [50, §150]. 

Euler's proof of Theorem 4.2 centers around his famous zeta function, 

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}, \]

defined for a real variable $s$. (The zeta function was later extended by Riemann 
to the entire complex plane, apart from a pole at $s = 1$, via analytic continuation.) It is not hard to show 
that the series $\zeta(s)$ converges uniformly on the interval $[a, \infty)$, where $a$ is any 
number strictly greater than 1. For $s > 1$, the infinite sum can also be expressed 
as an infinite product: 



\[ \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_q \left(1 - \frac{1}{q^s}\right)^{-1}, \]

where the product is over all primes $q$. This is known as the Euler product 
formula. Roughly, this holds because we can write each term of the product as 
the sum of a geometric series, 

\[ \left(1 - \frac{1}{q^s}\right)^{-1} = 1 + \frac{1}{q^s} + \frac{1}{q^{2s}} + \cdots, \]

and then expand the product into a sum. The unique factorization theorem tells 
us that every integer $n > 1$ can be written uniquely as a product $q_1^{i_1} q_2^{i_2} \cdots q_k^{i_k}$ of prime powers, and so 
the term $n^{-s} = q_1^{-i_1 s} q_2^{-i_2 s} \cdots q_k^{-i_k s}$ will occur exactly once in the expansion. 
Since we are dealing with infinite sums and products, the Euler product formula 
implicitly makes a statement about limits, and some care is necessary to make 
the argument precise; but this is not hard to do. 
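
As a numerical sanity check on the Euler product formula, the following Python sketch compares a truncated sum with a truncated product at $s = 2$, where the common value is known to be $\pi^2/6$. The helper names (`primes_up_to`, `zeta_partial_sum`, `euler_partial_product`) and the truncation bounds are illustrative assumptions of ours, not part of the sources discussed here; because both sides are truncated, they agree only approximately.

```python
import math

def primes_up_to(bound):
    """Sieve of Eratosthenes: return all primes <= bound (assumes bound >= 2)."""
    is_prime = [True] * (bound + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(bound) + 1):
        if is_prime[p]:
            for multiple in range(p * p, bound + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

def zeta_partial_sum(s, terms):
    """Truncated series: sum of 1/n^s for n = 1, ..., terms."""
    return sum(n ** -s for n in range(1, terms + 1))

def euler_partial_product(s, prime_bound):
    """Truncated Euler product: product of (1 - q^-s)^-1 over primes q <= prime_bound."""
    result = 1.0
    for q in primes_up_to(prime_bound):
        result *= 1.0 / (1.0 - q ** -s)
    return result

if __name__ == "__main__":
    s = 2
    print("truncated sum:    ", zeta_partial_sum(s, 100_000))
    print("truncated product:", euler_partial_product(s, 10_000))
    print("pi^2 / 6:         ", math.pi ** 2 / 6)
```

Raising either truncation bound brings the two printed values closer together, which is the limiting statement that the formula makes implicitly.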

If we take the logarithm of each side of the product formula and appeal to 
properties of the logarithm function, we obtain 

\[ \log \sum_{n=1}^{\infty} \frac{1}{n^s} = -\sum_q \log\left(1 - \frac{1}{q^s}\right). \]

Using the Taylor series expansion 

\[ \log(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \cdots, \]

and changing the order of summations yields 

\[ \log \sum_{n=1}^{\infty} \frac{1}{n^s} = \sum_q \frac{1}{q^s} + \sum_{n=2}^{\infty} \frac{1}{n} \sum_q \frac{1}{q^{ns}}. \]

At this stage, keep in mind that we want to show that $\sum_q \frac{1}{q}$ diverges, and notice 
that the first term on the right-hand side of the above equation is $\sum_q \frac{1}{q^s}$. Thus 
we should consider what happens as $s$ tends to 1 from above. One can show 
that the second term on the right-hand side is bounded by a constant that is 
independent of $s$, a fact that can be expressed using "big O" notation as follows: 

\[ \log \sum_{n=1}^{\infty} \frac{1}{n^s} = \sum_q \frac{1}{q^s} + O(1). \]

As $s$ approaches 1 from above, the left-hand side clearly tends to infinity. Thus, 
the sum $\sum_q \frac{1}{q^s}$ on the right-hand side must also tend to infinity, which implies that $\sum_q \frac{1}{q}$ 
diverges. 
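
The two facts used in this argument can be illustrated numerically. The Python sketch below is only an illustration under our own assumptions: the function names and cutoffs are hypothetical, and sums over all primes are truncated at a finite bound. It uses the identity $-\log(1 - x) - x = \sum_{n \geq 2} x^n / n$ to evaluate the second term prime by prime.

```python
import math

def primes_up_to(bound):
    """Sieve of Eratosthenes: return all primes <= bound (assumes bound >= 2)."""
    is_prime = [True] * (bound + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(bound) + 1):
        if is_prime[p]:
            for multiple in range(p * p, bound + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

def prime_reciprocal_sum(bound, s=1.0):
    """Truncation of sum_q 1/q^s to the primes q <= bound."""
    return sum(q ** -s for q in primes_up_to(bound))

def remainder_term(bound, s):
    """Truncation of sum_{n>=2} (1/n) sum_q q^(-n*s) to the primes q <= bound.

    For each prime q the inner series over n sums to -log(1 - q^-s) - q^-s.
    """
    return sum(-math.log1p(-(q ** -s)) - q ** -s for q in primes_up_to(bound))

if __name__ == "__main__":
    for bound in (10**2, 10**3, 10**4, 10**5):
        print(bound, prime_reciprocal_sum(bound), remainder_term(bound, 1.0))
```

In the printed table the second column keeps growing, though very slowly, while the third column settles down; only the first behavior reflects divergence.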

4.2 Group characters 

Just as Euler's proof shows that there are infinitely many primes by establishing 
that the series $\sum_q \frac{1}{q}$ is divergent, proofs of Dirichlet's theorem establish that 
there are infinitely many primes $q$ congruent to $m$ modulo $k$ by showing that 
the series $\sum_{q \equiv m \pmod{k}} \frac{1}{q}$ is divergent. Remember that we are assuming that m 
and k are relatively prime. The residues modulo k that are, moreover, relatively 
prime to k form a multiplicative group. To adapt the argument above, we need 
series that are more refined than the zeta function, and a device that enables 
us to focus attention on the integers that are congruent to m modulo k. This 
is where the notion of a group-theoretic character comes in. 

Let $G$ be a finite abelian group, written multiplicatively, with an identity 
denoted by 1. A character $\chi$ on $G$ is a homomorphism from $G$ to the nonzero 
complex numbers, $\mathbb{C}^*$. In other words, a character $\chi$ is a function from $G$ to $\mathbb{C}$ with nonzero values 
satisfying $\chi(g_1 g_2) = \chi(g_1)\chi(g_2)$ for every $g_1$ and $g_2$ in $G$. Since 
$\chi(1) = \chi(1 \cdot 1) = \chi(1)\chi(1)$ and $\chi(1)$ is nonzero, we have $\chi(1) = 1$. Since $G$ is a finite group, for 
every $g$ there is a least $n \geq 1$ satisfying $g^n = 1$; this $n$ is called the order of 
$g$ and denoted $o(g)$. The fact that $\chi(g)^{o(g)} = \chi(g^{o(g)}) = \chi(1) = 1$ means that 
$\chi(g)$ is a "root of unity" for every $g$ in $G$. The character which is equal to 1 for 
every $g$ in $G$ is called the trivial character and denoted by $\chi_0$. 

Define the product $\chi \cdot \psi$ of two characters pointwise, by 

\[ (\chi \cdot \psi)(g) = \chi(g)\psi(g), \]

for every $g$ in $G$. This multiplication is commutative, and we have $\chi \cdot \chi_0 = \chi$ 
for every character $\chi$, which is to say, $\chi_0$ is a multiplicative identity. Recall 
that if $\omega$ is any complex root of unity, its complex conjugate, $\overline{\omega}$, is also a root 
of unity, satisfying $\omega\overline{\omega} = 1$. We can also lift the operation of conjugation to 
characters, defining $\overline{\chi}$ by the equation $\overline{\chi}(g) = \overline{\chi(g)}$ for each $g$. Then clearly we 
have $\chi \cdot \overline{\chi} = \chi_0$. In other words, the set of characters on $G$ forms an abelian 
group, with $\chi^{-1} = \overline{\chi}$. We will denote this group $\widehat{G}$. 
The following theorem is fundamental: 

Theorem 4.3. If $G$ is any finite abelian group, $\widehat{G}$ is isomorphic to $G$. In 
particular, $|\widehat{G}| = |G|$. 
To see this, first consider the case where $G = \langle g \rangle = \{1, g, g^2, \ldots, g^{n-1}\}$ is a 
cyclic group generated by an element $g$ of order $n$. Then any character $\chi$ on 
$G$ has to map $g$ to an $n$th root of unity, $\omega$. This determines the behavior of 
$\chi$ completely, since then we have $\chi(g^i) = \omega^i$ for every $i$. Thus we can let $\chi_\omega$ 
denote the unique character that maps $g$ to $\omega$. But now if we let $\omega = e^{2\pi i/n}$, 
then $\omega$ is what is known as a "primitive root of unity," which is to say that all 
the $n$th roots of unity are given by $1, \omega, \omega^2, \ldots, \omega^{n-1}$. Notice that these roots form 
a multiplicative group that is isomorphic to $G$. It is easy to verify that the map 
which sends the element $g^i$ of $G$ to the character $\chi_{\omega^i}$ of $\widehat{G}$ is an isomorphism. 
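
The cyclic case is concrete enough to check by machine. In the Python sketch below, which is an illustration under our own conventions rather than anything found in the sources, the element $g^i$ is represented by its exponent $i$ modulo $n$, and the character $\chi_{\omega^j}$ is stored as the list of its values $\omega^{ij}$. The names `cyclic_characters` and `is_homomorphism` are hypothetical, and equalities are tested only up to floating-point tolerance.

```python
import cmath

def cyclic_characters(n):
    """Return the n characters of a cyclic group <g> of order n.

    chars[j][i] is the value of the j-th character on g^i, namely omega^(i*j),
    where omega = exp(2*pi*i/n) is a primitive n-th root of unity.
    """
    omega = cmath.exp(2j * cmath.pi / n)
    return [[omega ** (i * j) for i in range(n)] for j in range(n)]

def is_homomorphism(chi, n, tol=1e-9):
    """Check chi(g^a * g^b) = chi(g^a) * chi(g^b) for all exponents a, b."""
    return all(
        abs(chi[(a + b) % n] - chi[a] * chi[b]) < tol
        for a in range(n)
        for b in range(n)
    )

if __name__ == "__main__":
    n = 6
    chars = cyclic_characters(n)
    print(all(is_homomorphism(chi, n) for chi in chars))
    # The map j -> chars[j] turns addition of exponents into pointwise products.
    print(all(
        abs(chars[(j + l) % n][i] - chars[j][i] * chars[l][i]) < 1e-9
        for j in range(n) for l in range(n) for i in range(n)
    ))
```

The second check is the numerical counterpart of the claim that $g^i \mapsto \chi_{\omega^i}$ respects the group operations.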

In the more general case, we appeal to the structure theorem for finite 
abelian groups, which says that any such group $G$ can be written as a prod- 
uct $G_1 \times \ldots \times G_k$ of cyclic groups. This means that every element $g$ of $G$ 
can be written uniquely as a product $g = g_1 g_2 \cdots g_k$, where each $g_i$ is in $G_i$. 
Given characters $\chi_1, \chi_2, \ldots, \chi_k$ on $G_1, G_2, \ldots, G_k$, respectively, one gets a char- 
acter $\chi$ on $G$ defined by $\chi(g) = \chi_1(g_1)\chi_2(g_2) \cdots \chi_k(g_k)$, where $g = g_1 g_2 \cdots g_k$ 
is the decomposition described above. Moreover, it is not hard to show that 
every character on $G$ arises in this way. This shows that $\widehat{G}$ is isomorphic to 
$\widehat{G_1} \times \widehat{G_2} \times \ldots \times \widehat{G_k}$. By the analysis of the cyclic case, the latter is, in turn, 
isomorphic to $G_1 \times G_2 \times \ldots \times G_k$, and hence to $G$. 

The following theorem expresses two important properties, known as the 
"orthogonality relations" for group characters. 

Theorem 4.4. Let $G$ be a finite abelian group. Then for any character $\chi$ in $\widehat{G}$,
we have

$$\sum_{g \in G} \chi(g) = \begin{cases} |G| & \text{if } \chi = \chi_0 \\ 0 & \text{if } \chi \neq \chi_0, \end{cases}$$

and for any element $g$ of $G$, we have

$$\sum_{\chi \in \widehat{G}} \chi(g) = \begin{cases} |G| & \text{if } g = 1_G \\ 0 & \text{if } g \neq 1_G. \end{cases}$$

The first equation clearly holds when $\chi$ is the trivial character, $\chi_0$, since,
in this case, each term of the sum is equal to 1. Otherwise, pick $h$ such that
$\chi(h) \neq 1$ and note

$$\chi(h) \sum_{g \in G} \chi(g) = \sum_{g \in G} \chi(hg) = \sum_{g \in G} \chi(g),$$

where the last equality holds because $hg$ ranges over all of $G$ as $g$ does.
Since $\chi(h) \neq 1$, we must have $\sum_{g \in G} \chi(g) = 0$. The second equation can be
established in a similar way.

The orthogonality relations make it possible to do "finite Fourier analysis,"
in the following sense: if $f$ is any function from $G$ to the complex numbers and
we define the "Fourier transform" $\hat{f}$ of $f$ by $\hat{f}(\chi) = \sum_g f(g)\overline{\chi(g)}$, then $f$ can
be recovered from its Fourier transform: $f = \frac{1}{|G|} \sum_\chi \hat{f}(\chi)\chi$. This shows, in
particular, that any function from $G$ to the complex numbers can be written
as a linear combination of characters. The second orthogonality relation also
provides the following useful corollary:

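Corollary 4.5. For any $g, h \in G$ we have the following:

$$\sum_{\chi \in \widehat{G}} \overline{\chi(g)}\chi(h) = \begin{cases} |G| & \text{if } g = h \\ 0 & \text{if } g \neq h. \end{cases}$$

This follows from the fact that we have

$$\sum_{\chi \in \widehat{G}} \overline{\chi(g)}\chi(h) = \sum_{\chi \in \widehat{G}} \chi(g^{-1})\chi(h) = \sum_{\chi \in \widehat{G}} \chi(g^{-1}h) = \begin{cases} |G| & \text{if } g = h \\ 0 & \text{if } g \neq h. \end{cases}$$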

This corollary will enable us to focus on the residue class of m modulo k in the 
proof of Dirichlet's theorem. 
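The orthogonality relations, the corollary, and Fourier inversion can all be verified by direct computation on a small group. The sketch below is illustrative only. It assumes a cyclic group of order $n$ with elements represented by exponents, takes the Fourier transform with a complex conjugate and the inversion formula with a factor $1/|G|$, and uses hypothetical function names.

```python
import cmath


def characters(n):
    """Characters of a cyclic group of order n; chars[j][i] is chi_j(g**i)."""
    omega = cmath.exp(2j * cmath.pi / n)
    return [[omega ** ((i * j) % n) for i in range(n)] for j in range(n)]


def first_relation(n):
    """For each character chi, the sum of chi(g) over all g in G."""
    return [sum(chi) for chi in characters(n)]


def second_relation(n):
    """For each element g, the sum of chi(g) over all characters chi."""
    chars = characters(n)
    return [sum(chi[g] for chi in chars) for g in range(n)]


def corollary(n, g, h):
    """Sum over chi of conj(chi(g)) * chi(h); expected n if g == h, else 0."""
    return sum(chi[g].conjugate() * chi[h] for chi in characters(n))


def fourier_roundtrip(f):
    """Transform f (a list of values on G) and then invert the transform."""
    n = len(f)
    chars = characters(n)
    f_hat = [sum(f[g] * chi[g].conjugate() for g in range(n)) for chi in chars]
    return [sum(f_hat[j] * chars[j][g] for j in range(n)) / n for g in range(n)]


if __name__ == "__main__":
    print([round(abs(v), 6) for v in first_relation(8)])
    print(round(abs(corollary(8, 3, 3)), 6), round(abs(corollary(8, 3, 5)), 6))
    print([round(v.real, 6) for v in fourier_roundtrip([1.0, 2.0, 3.0, 4.0, 5.0])])
```

The function `corollary` is the computation that later isolates a single residue class: summing over all characters annihilates every term with $g \neq h$.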



4.3 Dirichlet characters and L-series 

Let $k$ be an integer greater than or equal to 1. It is a fundamental theorem of
number theory that an integer $n$ is relatively prime to $k$ if and only if $n$ has
a multiplicative inverse modulo $k$; in other words, if and only if there is some
$n'$ such that $nn' \equiv 1 \bmod k$. This implies that the residue classes of integers
modulo $k$ that are relatively prime to $k$ form a group, denoted $(\mathbb{Z}/k\mathbb{Z})^*$, with
multiplication modulo $k$. The cardinality of $(\mathbb{Z}/k\mathbb{Z})^*$, that is, the number of
residues relatively prime to $k$, is denoted $\varphi(k)$, and the function $\varphi$ is called the
Euler $\varphi$-function.

Any character $\chi$ on $(\mathbb{Z}/k\mathbb{Z})^*$ can be "lifted" to a function $X$ from $\mathbb{Z}$ to $\mathbb{C}$
defined by

$$X(n) = \begin{cases} \chi(n \bmod k) & \text{if } n \text{ is relatively prime to } k \\ 0 & \text{otherwise.} \end{cases}$$

Such a function is called a Dirichlet character modulo $k$. Dirichlet characters are
completely multiplicative, which is to say, $X(mn) = X(m)X(n)$ for every $m$ and
$n$ in $\mathbb{Z}$. Mathematicians typically use the symbol $\chi$ to range over Dirichlet characters,
blurring the distinction between such functions and their group-character
counterparts. This is harmless, since there is a one-to-one correspondence between
the two, and so we will adopt this practice as well.

We can now generalize the method of Euler's proof. Roughly speaking, we 
need a variant of the zeta function that will allow us to focus on primes in a 
particular residue class modulo $k$. To that end, given a Dirichlet character $\chi$
modulo $k$, define the Dirichlet L-function, or L-series,

$$L(s,\chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s},$$

where $s$ is any complex number. This formal series will converge whenever
$\Re(s) > 1$, that is, the real part of $s$ is greater than one. And just as $\zeta(s)$
can be written as a product via the Euler product formula, so each
$L(s,\chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}$ has a useful product expansion.

Theorem 4.6. Let $\chi$ be a Dirichlet character modulo $k$. Then the L-function
associated with $\chi$ has an Euler product expansion for $\Re(s) > 1$,

$$L(s,\chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{q} \left(1 - \frac{\chi(q)}{q^s}\right)^{-1} = \prod_{q \nmid k} \left(1 - \frac{\chi(q)}{q^s}\right)^{-1}.$$

The last identity follows from the fact that, in the product, we can ignore
those primes $q$ that divide $k$, since $\chi(q) = 0$ for such $q$. With the product
formula in place, we can sketch a proof of Dirichlet's theorem.
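The Euler product expansion can be tested numerically for a specific character. The following sketch is an illustration under stated assumptions: it uses the nontrivial Dirichlet character modulo 4 as an example (the text does not single out this character), evaluates at $s = 2$, and truncates both the series and the product at an arbitrary cutoff. All function names are hypothetical.

```python
def chi4(n):
    """Nontrivial Dirichlet character modulo 4 (example chosen for this sketch)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def l_series_partial_sum(chi, s, terms):
    """Partial sum of chi(n) / n**s for n = 1, ..., terms."""
    return sum(chi(n) / n ** s for n in range(1, terms + 1))


def euler_partial_product(chi, s, limit):
    """Product of 1 / (1 - chi(q) / q**s) over primes q <= limit."""
    product = 1.0
    for q in primes_up_to(limit):
        product *= 1.0 / (1.0 - chi(q) / q ** s)
    return product


if __name__ == "__main__":
    series = l_series_partial_sum(chi4, 2, 20000)
    product = euler_partial_product(chi4, 2, 20000)
    print(series, product)
```

The prime 2 contributes a factor of 1 to the product because `chi4(2)` is 0, which is the observation behind the last identity in the theorem.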




4.4 Contemporary proofs of Dirichlet's theorem 

Recall that we want to prove that there are infinitely many primes $q$ such that
$q \equiv m \pmod{k}$, where $m$ and $k$ are relatively prime. As in the proof that
$\sum_q \frac{1}{q}$ diverges, we begin by taking logarithms of both sides of the Euler product
expansion for $L(s,\chi)$, where $\chi$ is a Dirichlet character modulo $k$:

$$\log L(s,\chi) = -\sum_{q \nmid k} \log\left(1 - \frac{\chi(q)}{q^s}\right).$$

As before, we make use of the Taylor series expansion for the logarithm on the
right hand side of the above equation to obtain:

$$\log L(s,\chi) = \sum_{q \nmid k} \sum_{j=1}^{\infty} \frac{1}{j} \frac{\chi(q)^j}{q^{sj}} = \sum_{q \nmid k} \frac{\chi(q)}{q^s} + \sum_{q \nmid k,\, j \geq 2} \frac{1}{j} \frac{\chi(q)^j}{q^{sj}}.$$

One can show that the second term in the expression is bounded by a constant
that is independent of $s$ and $\chi$, which can be expressed as follows:

$$\log L(s,\chi) = \sum_{q \nmid k} \frac{\chi(q)}{q^s} + O(1).$$



Now comes the crucial use of Corollary 4.5 to pick out the primes in the relevant
residue class. We multiply each side of the above equation by $\overline{\chi(m)}$ and then
take the sum of these over all the Dirichlet characters modulo $k$. (Recall that we
can identify each Dirichlet character with the corresponding group character,
that is, the corresponding element of $\widehat{(\mathbb{Z}/k\mathbb{Z})^*}$.) Thus we have:

$$\sum_{\chi \in \widehat{(\mathbb{Z}/k\mathbb{Z})^*}} \overline{\chi(m)} \log L(s,\chi) = \sum_{\chi \in \widehat{(\mathbb{Z}/k\mathbb{Z})^*}} \sum_{q \nmid k} \frac{\overline{\chi(m)}\chi(q)}{q^s} + O(1).$$

To simplify this expression, we exchange the summations on the right-hand side,
and appeal to Corollary 4.5. Since the cardinality of the group $(\mathbb{Z}/k\mathbb{Z})^*$ is $\varphi(k)$,
we obtain

$$\sum_{\chi \in \widehat{(\mathbb{Z}/k\mathbb{Z})^*}} \overline{\chi(m)} \log L(s,\chi) = \varphi(k) \sum_{q \equiv m \pmod{k}} \frac{1}{q^s} + O(1). \tag{2}$$

^^By analytic continuation, each of the functions $L(s,\chi)$ except for $L(s,\chi_0)$ can be extended to
an analytic function on the domain $\{s \mid \Re(s) > 0\}$.

This is analogous to the equation (1) in Euler's proof. Our goal is once again to
show that the left-hand side tends to infinity as $s$ approaches 1 from above; this
implies that the right-hand side tends to infinity, which, in turn, implies that
there are infinitely many primes $q$ that are congruent to $m$ modulo $k$. However,
now the left-hand side is considerably more complicated than the expression
$\log \sum_{n=1}^{\infty} \frac{1}{n^s}$ in Euler's proof.
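The step that isolates a residue class can be checked exactly on a finite set of primes, where no $O(1)$ term is involved. The sketch below makes several assumptions that are not in the text: the modulus is a prime $p$, the characters are built from a primitive element found by brute force, and the sums run only over primes up to an arbitrary cutoff. The names are hypothetical.

```python
import cmath


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def primitive_root(p):
    """Smallest c whose powers give every nonzero residue modulo the prime p."""
    for c in range(1, p):
        if len({pow(c, e, p) for e in range(p - 1)}) == p - 1:
            return c
    raise ValueError("no primitive element found; is p prime?")


def dirichlet_characters(p):
    """All p - 1 Dirichlet characters modulo a prime p, as functions on the integers."""
    c = primitive_root(p)
    index = {pow(c, e, p): e for e in range(p - 1)}

    def make(a):
        def chi(n):
            r = n % p
            if r == 0:
                return 0j  # value 0 on multiples of p
            return cmath.exp(2j * cmath.pi * a * index[r] / (p - 1))
        return chi

    return [make(a) for a in range(p - 1)]


def character_side(p, m, s, limit):
    """Sum over chi of conj(chi(m)) * sum over primes q <= limit of chi(q) / q**s."""
    primes = [q for q in range(2, limit + 1) if is_prime(q)]
    total = 0j
    for chi in dirichlet_characters(p):
        total += chi(m).conjugate() * sum(chi(q) / q ** s for q in primes)
    return total


def residue_side(p, m, s, limit):
    """phi(p) times the sum of 1 / q**s over primes q <= limit with q = m (mod p)."""
    primes = [q for q in range(2, limit + 1) if is_prime(q)]
    return (p - 1) * sum(1.0 / q ** s for q in primes if q % p == m % p)


if __name__ == "__main__":
    print(character_side(7, 3, 1.5, 500))
    print(residue_side(7, 3, 1.5, 500))
```

The two printed values agree up to rounding, since exchanging the two finite sums and applying the corollary is an exact identity; the analytic difficulty lies entirely in the behavior of $\log L(s,\chi)$ as $s$ approaches 1.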

To show that $\sum_{\chi \in \widehat{(\mathbb{Z}/k\mathbb{Z})^*}} \overline{\chi(m)} \log L(s,\chi)$ tends to infinity as $s$ approaches 1,
we divide the characters into three classes, as follows:

1. The first class contains only the principal character $\chi_0$, which takes the
value of 1 for all arguments that are relatively prime to $k$, and 0 otherwise.

2. The second class consists of all those characters which take only real values
(i.e. 0 or ±1), other than the principal character.

3. The third class consists of those characters which take at least one complex 
value. 

It is not difficult to show that $L(s,\chi_0)$ has a simple pole at $s = 1$, which implies
that the term $\overline{\chi_0(m)} \log L(s,\chi_0)$ approaches infinity as $s$ approaches 1. The real
work involves showing that for all the other characters $\chi$, $L(s,\chi)$ has a finite
nonzero limit. This implies that the other terms in the sum approach a finite
limit, and so the entire sum approaches infinity.

For characters in the third class, that is, the characters that take on at least 
one complex value, the result is not difficult. For characters in the second class, 
the result is much harder, and Dirichlet used deep techniques from the theory of 
quadratic forms to obtain it. In the years that followed, other mathematicians 
found alternative, and simpler, ways of handling this case. But even in modern 
presentations, this case remains the most substantial and technically involved 
part of the proof. 

Our presentation has been thoroughly "modern." In the next section, we 
will consider some of the methodological features of the proof that make it so. 
This will enable us to draw interesting contrasts with Dirichlet's proof, and then
explore the way that presentations of Dirichlet's theorem gradually took on such
a modern character.

5 Modern aspects of contemporary proofs



5.1 The hermeneutics of proof 

Although our presentation uses contemporary terminology and notation, there 
is a sense in which it is a faithful description of Dirichlet's 1837 proof. Dirichlet 
did not, and could not, rely on a general notion of group character, as the 
general notion of a group was not articulated before Cayley did so in 1854 (TS], 
and was not brought into general currency until Kronecker's 1870 paper [63],
which first presented the structure theorem for finite abelian groups. Another
important difference is that even though Dirichlet's argument used complex 
numbers in a central way, there was less established background in complex 
analysis than is available today, and Dirichlet tended to reduce the calculations 
to real analysis whenever possible. Thus his variable s ranged over real numbers, 
and his calculations involve real-valued sines and cosines where today we are 
comfortable sticking with the complex exponential. In addition, we have already 
noted that many of the technical details were streamlined over the years. Despite 
all this, the outline above characterizes the central ideas of his proof, and most 
mathematicians would not find it unreasonable to say that that is, essentially, 
how he obtained the result. 

But, as we will see in Section 6, there is one very striking difference: in
Dirichlet's original presentations, there is no mention of characters at all. That 
is, Dirichlet's papers contain certain expressions that we now recognize as values 
of the various characters, and summations that are tantamount to summing 
over all the characters. But the characters themselves are only objects that we 
project back into the argument from our current understanding. They hover 
over the page as shade-like premonitions, ghosts of mathematics yet to come. 

In Section 7, we will consider presentations of Dirichlet's theorem given by
Dedekind, de la Vallee-Poussin, Hadamard, Kronecker, and Landau, and see 
how the characters were gradually brought to life. We will see that many of 
the benefits of giving characters a substantive embodiment are notational and 
pragmatic, but that is not to say that they are merely notational and pragmatic: 
treating characters as bona fide objects comes with serious mathematical con- 
straints and obligations, and provides conceptual reorientations that have great 
bearing on the kind of mathematics we do, and the way we do it. We will argue 
that the reification of the notion of a character is a prototypical instance of the 
conceptual changes that are hallmarks of the transition to "modern" mathematical
thought, and that understanding how and why the changes came about
sheds light on the way we do mathematics today.

But the observations above present us with a terminological conundrum: 
should we describe the various historical texts as "versions" or "presentations" 
of Dirichlet's proof, or different proofs entirely? Having raised this issue, we 
will, for the most part, set it aside, and be fairly cavalier with our terminology. 
Since our specific concern is to study the way that language, conceptualization,
and inferential practice evolved over the years, and the effects that had on the
mathematics, we need not explicitly address the question as to when it is proper
to consider two proofs essentially the same or essentially different.

^^See Wussing [106] for the history of group theory.

Talking about historical texts in modern terms is difficult, and it is always 
misleading to portray the history of mathematics as a muddled and inefficient 
attempt to arrive at the contemporary enlightened view. We hope we have not 
fallen into this trap. If there is anything that deserves to be treated as a rational 
pursuit, mathematics should count as such, and so one would expect there to 
be good reasons that we do mathematics the way we do. At the same time, 
there are often good reasons to question the way we do mathematics today, 
and recognize there are tradeoffs involved in the historical decisions that were 
made. Comparing mathematical texts from different historical eras and trying 
to understand what has changed provides us with a fruitful way of understanding
the values that drive mathematical change. But it is often easier to draw
contrasts by starting with the mathematics with which we are most familiar, 
and so, having discussed contemporary approaches to Dirichlet's theorem, let 
us foreshadow some of the contrasts we wish to consider. 

5.2 Characters as objects 

In Section 3, we noted that characters (whether we refer to group characters,
or Dirichlet characters) are instances of the contemporary function concept, a 
concept which evolved significantly over the course of the nineteenth century. 
These are some of the salient features of the treatment of characters in our 
modern presentation: 

1. Group characters are given an abstract, axiomatic definition as functions 
that satisfy the homomorphism property, and the Dirichlet characters are 
introduced as a natural extension of the notion. 

2. In particular, one defines the set of characters modulo k extensionally. 
Only later does one show that this set is finite, and provide explicit ways 
of describing and enumerating them. 

3. Characters are studied in their own right, and their general properties are 
enunciated in propositions and theorems. 

4. One sums over sets of characters, without needing representations for any 
particular one. More generally, one characterizes operations on characters
(such as the product of two characters) extensionally, and not in terms of 
their representations. 

5. Characters appear as arguments to other functions, namely, the L-functions.
In particular, it is clear from the definition that $L(s,\chi)$ depends only on
the extension of $\chi$.

6. One defines various sets of characters extensionally, for example, distinguishing
the trivial, real, and complex characters in terms of the values
they assume. More generally, one typically carries out arguments without
making reference to any particular representation.

7. The characters modulo k are viewed as elements of an algebraic structure,
namely, a group, with multiplication defined pointwise and the trivial
character serving as identity.

In Section 6, we will discuss Dirichlet's original proof, and in Section 7, we will
consider the way the proof was gradually transformed to reflect our contemporary
understanding. We will see that proofs along the way possess various
subsets of the properties just enumerated, and that, in some cases, the authors
are noticeably squeamish, or at least self-conscious, about these features. Importantly,
Dirichlet's original proof has none of the properties just enumerated,
providing a clear contrast to the style of presentation that is common today. 

One way of characterizing the difference between Dirichlet's original proof 
and our modern presentation is to say that, in the latter, characters are treated 
as full-fledged mathematical objects, whereas there are no such objects in Dirich- 
let's version. The claim that over the course of the century characters gradually 
become treated as new sorts of objects supports our contention that the trans- 
formation has metaphysical overtones. But what, exactly, does it mean to say 
that a piece of mathematics sanctions certain entities as "objects"? We do not 
claim to have a precise set of criteria, but the list above provides some insight 
as to potential hallmarks of object-hood. 

To start with, consider the fact that in the modern proof we identify the 
concept of a "character," reason about entities falling under this concept, and 
ascribe various properties to them. This seems to be a bare-minimum require- 
ment to support the claim that a mathematical text sanctions certain entities 
as objects, namely, that it recognizes them as being entities of a certain sort, 
capable of bearing predicates and being the target of certain operations. It 
doesn't matter whether we take this sort as fundamental (for example, as we 
take the notion of "integer" in most contexts) or as derived from a broader sort 
(for example, when we view characters as functions of a certain kind). What 
is important is that the entities belong to a grammatically recognized category, 
and this category helps determine the predicates and operations that can be 
meaningfully ascribed to it. For example, one can talk about one integer being 
divisible by another, but not one character as being divisible by another. In 
sum, the first criterion to look for is whether the entities in question have a 
recognizable role in the grammar of the language. 

The use of the word "representation" in our informal list provides another 
clue, insofar as we generally speak of a representation of something or other. 
We think of expressions like "6" and "2 x 3" as representing an integer. As 
Michael Detlefsen has pointed out to us, one common view is that an "object" 
is what remains invariant under all the different representations of a thing;
in other words, what is left over when one has "squeezed out" all the features 
that are contingent on particular representations. When it comes to the notion 
of a function, what is the underlying invariant? There may be lots of ways 
of representing a particular function, but what makes them representations of 
the same function is surely that they take the same values on any given input. 
Thus reasoning about functions extensionally is a sign that one is reasoning
about functions as objects, rather than reasoning about their representations.

A third hallmark of object-hood that is present in our list is evidenced by
the fact that we can sum over characters, just as we can sum over natural
numbers. Notice that in an expression $\sum_\chi \ldots \chi \ldots$, the variable $\chi$ is a bound
variable that ranges over the entities in question. Similar considerations hold
for the universal and existential quantifiers. If we view the natural numbers as
quintessential mathematical objects, a sign that entities have attained the status
of object-hood is that it is possible to quantify over them in theorems and
definitions, just as one quantifies over the natural numbers. The consideration
admits of degrees: whereas the bare-minimum requirement discussed above may 
allow us to state theorems about, and define operations on, "arbitrary" entities 
of the sort, a more full-blown notion of object-hood will give us more latitude 
in the kinds of quantification and binding that are allowed. 

A fourth criterion for object-hood is evidenced by the fact that characters 
are allowed to appear as arguments to the L-functions, for example, in the 
expression $L(s,\chi)$. To avoid making this consideration depend on the modern
notion of a function, let us note that what is essential here is that an expression
denoting a recognized mathematical object (in this case, a complex number)
is allowed to depend on a character, much the way that a real number $s_i$ in
a sequence depends on the value of the index $i$, or, in Euler's notation, the
value $\pi n$ depends on $n$. What makes this more potent than the mere ability to
define operations on characters is that the dependent expressions are treated as
objects in their own right. $L(s,\chi)$ is not just an operation on $s$ and $\chi$: fixing
$\chi$, the function $s \mapsto L(s,\chi)$ is an object that one can integrate, and summing
over $\chi$, one obtains a function of $s$. In a similar way, one can form sets and
sequences of characters, in much the same way that one forms sets and sequences
of numbers, and one can define a group whose elements are characters, in much
the same way that one can form a group whose elements are residues modulo
some number m. 

To summarize, here are some of the various senses in which characters are 
treated as objects in the modern presentation: 

1. Characters fall under a recognized grammatical category, which allows us
to state things about them and define operations and predicates on them. 

2. There is a clear understanding of what it means for two expressions to 
represent the same character, and conventions ensure that the expressions 
occurring in a proof respect this "sameness." 

3. One can quantify and sum over characters; in short, they can fall under 
the range of a bound variable. 

4. One can define functions which take characters as arguments, and sets 
of characters; indeed, characters can be elements of arbitrary algebraic 
structures. 

^^Recall Quine's dictum that "there is no entity without identity," for example in [87].
^^This echoes another Quine dictum, "to be is to be the value of a bound variable" [89, p. 15].

We recognize that determining the "ontological commitments" of a practice may
not be as clear-cut as Quine's writings suggest, and we do not claim to have 
given a precise sense to the question as to whether a particular mathematical 
proof is committed to functions as objects. But we do claim to have identified 
various important senses in which contemporary proofs of Dirichlet's theorem 
treat functions as ordinary mathematical objects, whereas Dirichlet's original 
proof does not. 

Below, we will not directly address the question as to whether any of these 
are necessary or sufficient criteria for being a mathematical object. After all, 
our goal is not so much to clarify the notion of a mathematical object as to 
explore the reasons that mathematical language and practice evolved the way 
it did, and the rational considerations that bore upon this evolution. But we 
hope we have made clear that these features of the treatment of characters turn 
on very fundamental aspects of mathematical language and inferential practice, 
and, as such, reflect nontrivial metaphysical commitments. 

6 Dirichlet's original proof 

We have already asserted that characters appear only "implicitly" in Dirichlet's
original proof [28]. There is nothing mysterious about this: what we mean is
that, in Dirichlet's proof, there are certain symbolic expressions that we now 
recognize as denoting the values of characters; and that, moreover, some of 
Dirichlet's calculations and inferences invoke what we now recognize as general 
properties of characters. 

Let us spell out the details. Like Dirichlet, we will first consider the case
where the common difference in the arithmetic progression is a prime number,
denoted by $p$ instead of $k$. It was well known in the nineteenth century that one
can always find a primitive element modulo $p$, which is to say, an element $c$
such that the $p - 1$ residues $c^0, c^1, c^2, \ldots, c^{p-2}$ modulo $p$ yield all the nonzero
residues, $1, 2, \ldots, p - 1$ (not necessarily in the same order). In modern terms, we
would say that the group $(\mathbb{Z}/p\mathbb{Z})^*$ of units modulo $p$ is cyclic, generated by the
residue class of $c$. In more elementary terms, this amounts to saying that for
every integer $n$ not divisible by $p$, there is a number $\gamma_n$ with the property that
$c^{\gamma_n} \equiv n \bmod p$.
We saw in Section 4.2 that if $\chi$ is a Dirichlet character modulo $p$ (that is, $\chi$
corresponds to a character on $(\mathbb{Z}/p\mathbb{Z})^*$), then $\chi(c)$ is a $(p-1)$st root of unity, say, $\omega$;
and, moreover, $\chi$ is entirely determined by $\omega$, in the sense that for every $n$
relatively prime to $p$, $\chi(n) = \omega^{\gamma_n}$. Thus Dirichlet simply wrote $\omega^{\gamma_n}$ where we
would write $\chi(n)$. The notation presupposes that one has fixed a choice of the
primitive element, $c$, though any primitive element will work equally well.
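The claim that any primitive element works equally well can be illustrated by computation. The sketch below is an assumption-laden example rather than anything in Dirichlet's text: it takes $p = 7$, for which 3 and 5 are both primitive elements, tabulates the values $\omega^{\gamma_n}$ for every $(p-1)$st root of unity $\omega$, and compares the resulting sets of value tables. The function names are hypothetical.

```python
import cmath


def indices(p, c):
    """Map each nonzero residue n modulo p to gamma_n, where c**gamma_n = n (mod p)."""
    table = {pow(c, e, p): e for e in range(p - 1)}
    if len(table) != p - 1:
        raise ValueError(f"{c} is not a primitive element modulo {p}")
    return table


def character_tables(p, c):
    """Set of value tables (chi(1), ..., chi(p-1)), one for each (p-1)st root of unity."""
    gamma = indices(p, c)
    tables = set()
    for a in range(p - 1):
        omega = cmath.exp(2j * cmath.pi * a / (p - 1))
        values = [omega ** gamma[n] for n in range(1, p)]
        # Round so that tables computed along different routes compare equal.
        tables.add(tuple((round(v.real, 9), round(v.imag, 9)) for v in values))
    return tables


if __name__ == "__main__":
    with_3 = character_tables(7, 3)
    with_5 = character_tables(7, 5)
    print(len(with_3), with_3 == with_5)
```

Changing the primitive element changes which root of unity labels which character, but not the collection of functions obtained. The representation depends on the choice; the characters do not.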

So far, so good. In the more general case where the modulus is a composite
number $k$, however, things get more complicated. First, write $k$ as a product of
primes,

$$k = 2^\lambda p_1^{\pi_1} \cdots p_j^{\pi_j},$$

where each $p_i$ is an odd prime and $\pi_i$ is greater than or equal to 1. Then the
group of units modulo $k$ is isomorphic to the product of the groups of units
modulo each term in the factorization. Gauss had already shown that if $p$ is an odd prime
and $\pi$ is an integer greater than or equal to 1, then one can more generally
find a primitive element $c$ modulo $p^\pi$. This means that the residue class of
$c$ generates the cyclic group $(\mathbb{Z}/p^\pi\mathbb{Z})^*$, or, equivalently, for every $n$ relatively
prime to $p$ there is a $\gamma_n$ such that $c^{\gamma_n} \equiv n \bmod p^\pi$. Thus we can choose primitive
elements $c_1, \ldots, c_j$ corresponding to $p_1^{\pi_1}, p_2^{\pi_2}, \ldots, p_j^{\pi_j}$. If $\lambda \geq 3$, however, there
is no primitive element modulo $2^\lambda$. Rather, $(\mathbb{Z}/2^\lambda\mathbb{Z})^*$ is a product of two cyclic
groups, and for every $n$ relatively prime to $2^\lambda$ there are an $\alpha_n$ and $\beta_n$ such that
$(-1)^{\alpha_n} 5^{\beta_n} \equiv n \bmod 2^\lambda$. Thus for any $n$ relatively prime to $k$, we can write

$$n \equiv (-1)^{\alpha_n} 5^{\beta_n} c_1^{\gamma_{1,n}} c_2^{\gamma_{2,n}} \cdots c_j^{\gamma_{j,n}} \bmod k$$

where each $\gamma_{i,n}$ is the index of $n$ relative to $p_i^{\pi_i}$. As above, if we choose appropriate
roots of unity $\theta, \phi, \omega_1, \omega_2, \ldots, \omega_j$, we obtain a character

$$\chi(n) = \theta^{\alpha_n} \phi^{\beta_n} \omega_1^{\gamma_{1,n}} \omega_2^{\gamma_{2,n}} \cdots \omega_j^{\gamma_{j,n}}. \tag{3}$$

And, once again, every character is obtained in this way. We should note that
Dirichlet used the notation $p, p', \ldots$ rather than $p_1, \ldots, p_j$ to denote the sequence
of odd primes. Moreover, he used the notation $\alpha, \beta, \gamma, \gamma', \ldots$ to denote the
indices, suppressing the dependence on $n$. Thus, Dirichlet wrote $\theta^\alpha \phi^\beta \omega^\gamma \omega'^{\gamma'} \cdots$
for the expression we have denoted $\chi(n)$ above, leaving it up to us to keep in
mind that $\alpha, \beta, \ldots$ depend on $n$.

To summarize, in the simple case of a prime modulus $p$, Dirichlet fixed a
primitive element $c$ modulo $p$, and represented each character $\chi$ in terms of a
$(p-1)$st root of unity, $\omega$. In that case, the value $\chi(n)$ is given by $\omega^{\gamma_n}$. In the
more general case of a composite modulus $k$, Dirichlet fixed primitive elements
modulo the terms of the prime factorization of $k$, and represented each character
$\chi$ in terms of a sequence $\theta, \phi, \omega, \omega', \ldots$ of roots of unity. In that case, the value $\chi(n)$
was written $\theta^\alpha \phi^\beta \omega^\gamma \omega'^{\gamma'} \cdots$, suppressing the information that the exponents
$\alpha, \beta, \gamma, \gamma', \ldots$ depend on $n$.

Recall that our contemporary presentation had little to say about particular
characters, other than the trivial character, $\chi_0$. Rather, characters appear as
arguments to the L-functions, $L(s,\chi)$, and the proof has us consider summations
over the set of all characters. Let us now consider how Dirichlet handled these
as well.

Again, with Dirichlet, we begin with the easier case where the common
difference of the arithmetic progression is a prime, $p$. Recall that, in that case,
each character $\chi$ corresponds to a $(p-1)$st root of unity, $\omega$. Dirichlet stated the
Euler product formula as follows:

We therefore have the equation

$$\prod \frac{1}{1 - \omega^\gamma \frac{1}{q^s}} = \sum \omega^\gamma \frac{1}{n^s} = L, \tag{4}$$

where the multiplication sign ranges over the whole series of primes
with the sole exception of $p$, while the summation involves all the
integers from 1 to $\infty$ that are not divisible by $p$. The letter $\gamma$ denotes
$\gamma_q$ on the left, and $\gamma_n$ on the right. [28, p. 3]

Compare this to the statement of Theorem 4.6 above. Since there are $p - 1$
distinct $(p-1)$st roots of unity, Dirichlet continued:

The equation just found represents $p - 1$ different equations, which
are obtained by replacing $\omega$ with its $p - 1$ values. It is known that
these $p - 1$ different values can be represented as powers of one such
$\Omega$, chosen appropriately, so that the values are then:

$$\Omega^0, \Omega^1, \Omega^2, \ldots, \Omega^{p-2}.$$

In accordance with this representation, we will write the different
values $L$ of the series or product as:

$$L_0, L_1, L_2, \ldots, L_{p-2}$$

. . . [28, p. 3]

Notice that Dirichlet says that the Euler product formula "represents $p - 1$
different equations," rather than thinking of it as a single equation parametrized
by $\omega$.

In the more general case where the common difference is some composite $k$,
Dirichlet's procedure is completely analogous. First, he demonstrated that the
Euler product formula holds:

$$\prod \frac{1}{1 - \theta^\alpha \phi^\beta \omega^\gamma \omega'^{\gamma'} \cdots \frac{1}{q^s}} = \sum \theta^\alpha \phi^\beta \omega^\gamma \omega'^{\gamma'} \cdots \frac{1}{n^s} = L, \tag{5}$$

where the multiplication sign ranges over all primes, with the exclusion
of $2, p, p', \ldots$, and the summation ranges over all the positive
integers that are not divisible by any of the primes $2, p, p', \ldots$.
The system of indices $\alpha, \beta, \gamma, \gamma', \ldots$ on the left side corresponds to
the number $q$, and on the right side to the number $n$. The general
equation (5), in which the different roots $\theta, \phi, \omega, \omega', \ldots$ can be
combined with one another arbitrarily, clearly contains $K$-many particular
equations. [28, p. 17]

"Man hat daher die Gleichung:

$$\prod \frac{1}{1 - \omega^\gamma \frac{1}{q^s}} = \sum \omega^\gamma \frac{1}{n^s} = L, \tag{4}$$

wo sich das Multiplicationszeichen auf die ganze Reihe der Primzahlen, mit alleiniger Ausnahme
von $p$, erstreckt, während die Summation sich auf alle ganzen Zahlen von 1 bis $\infty$
bezieht, welche nicht durch $p$ teilbar sind. Der Buchstabe $\gamma$ bedeutet auf der ersten Seite $\gamma_q$,
auf der zweiten dagegen $\gamma_n$." [28, pp. 317-318] We have replaced Dirichlet's equation number
with our own, and throughout this section we have modified the translation cited in [28].

"Die eben gefundene Gleichung repräsentirt $p - 1$ verschiedene Gleichungen, welche man
erhält, wenn man für $\omega$ seine $p - 1$ Werthe setzt. Bekanntlich lassen sich diese $p - 1$
verschiedenen Werthe durch die Potenzen von einem derselben $\Omega$ darstellen, wenn dieser gehörig
gewählt wird, und sind dann:

$$\Omega^0, \Omega^1, \Omega^2, \ldots, \Omega^{p-2}.$$

Wir werden, dieser Darstellung entsprechend, die verschiedenen Werthe $L$ der Reihe oder des
Productes mit:

$$L_0, L_1, L_2, \ldots, L_{p-2}$$

bezeichnen. . . ." [28, p. 318]

Note, again, Dirichlet's characterization of the general equation as "containing"
the particular instances. Here, $K$ is what we have called $\varphi(k)$, the cardinality of
the group $(\mathbb{Z}/k\mathbb{Z})^*$. Dirichlet went on to note that we can choose primitive roots
of unity $\Theta, \Phi, \Omega, \Omega', \ldots$ so that all choices of $\theta, \phi, \omega, \omega', \ldots$ can be expressed as
powers of these,

$$\theta = \Theta^a, \quad \phi = \Phi^b, \quad \omega = \Omega^c, \quad \omega' = \Omega'^{c'}, \quad \ldots,$$

just as in the simpler case. He wrote that we can thus refer to the L-series in a
"convenient" (bequem) way, as $L_{a,b,c,c',\ldots}$, where $a, b, c, c', \ldots$ are the exponents
of the chosen primitive roots. Notice that the representations just described
depend on fixed, but arbitrary, choices of the primitive roots of unity, as well
as fixed but arbitrary generators of the cyclic groups. Modulo those choices, we
have parameters $a, b, c, c', \ldots$ that vary to give us all the characters; and for each
choice of $a, b, c, c', \ldots$ we have an explicit expression that tells us the value of
the character at $n$. For Dirichlet, summing over characters therefore amounted
to summing over all possible choices of this representing data.
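Summing over representing data can be made concrete for a small composite modulus. The following sketch rests on assumptions chosen for illustration: the modulus is $k = 15 = 3 \cdot 5$, which is odd, so the parameters $a, b$ attached to the power of 2 do not occur and only the pair $(c, c')$ remains; 2 is used as the primitive element modulo both 3 and 5; and the function names are hypothetical. It loops over every choice of $(c, c')$ and checks that the resulting sum picks out a residue class modulo 15.

```python
import cmath
from itertools import product
from math import gcd

K = 15                 # assumed example modulus, 15 = 3 * 5
PRIME_FACTORS = [3, 5]
GENERATORS = [2, 2]    # 2 is a primitive element modulo 3 and modulo 5


def index_system(n):
    """Indices (gamma, gamma') of n relative to the prime factors 3 and 5."""
    result = []
    for q, c in zip(PRIME_FACTORS, GENERATORS):
        table = {pow(c, e, q): e for e in range(q - 1)}
        result.append(table[n % q])
    return tuple(result)


def character_value(exponents, n):
    """Value at n of the character whose representing data is exponents = (c, c')."""
    if gcd(n, K) != 1:
        return 0j
    value = 1 + 0j
    for q, a, gamma in zip(PRIME_FACTORS, exponents, index_system(n)):
        big_omega = cmath.exp(2j * cmath.pi / (q - 1))  # chosen primitive root of unity
        value *= big_omega ** (a * gamma)
    return value


def sum_over_representing_data(m, n):
    """Sum of conj(chi(m)) * chi(n) over all choices of the exponents (c, c')."""
    ranges = [range(q - 1) for q in PRIME_FACTORS]
    return sum(
        character_value(ex, m).conjugate() * character_value(ex, n)
        for ex in product(*ranges)
    )


if __name__ == "__main__":
    print(abs(sum_over_representing_data(2, 17)))  # 17 is congruent to 2 modulo 15
    print(abs(sum_over_representing_data(2, 7)))   # 7 is not
```

The loop over `product(*ranges)` never mentions a character as such. It iterates over exponent tuples, which mirrors the way the sums in Dirichlet's presentation range over the values of $a, b, c, c', \ldots$.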

In the special case of where the common difference is a prime, p, Dirichlet 
ran through calculations similar to those described in Section to obtain the 
following identity: 



^ q~ ^ 2 + 3 ^ q^ ^ "' 

= ^-(logLo + r*-^" logLi + logLs + . . . + logip-2). 

p - 1 



He has essentially arrived at equation ^ in Section which read as follows: 

1 



E 



x(m)logi(s,x) = ^(fc) - + 0(1) m 



Xe(z7fcZ)' 9=™ (mod k) 



$$\prod \frac{1}{1 - \theta^{\alpha}\varphi^{\beta}\omega^{\gamma}\omega'^{\gamma'}\cdots\frac{1}{q^{s}}} = \sum \theta^{\alpha}\varphi^{\beta}\omega^{\gamma}\omega'^{\gamma'}\cdots\frac{1}{n^{s}} = L, \quad (5)$$

wo sich das Multiplicationszeichen auf die ganze Reihe der Primzahlen, mit Ausschluss von
2, p, p', . . . , und das Summenzeichen auf alle positiven ganzen Zahlen, welche durch keine
der Primzahlen 2, p, p', . . . theilbar sind, erstreckt. Das System der Indices $\alpha, \beta, \gamma, \gamma', \ldots$
entspricht auf der ersten Seite der Zahl q, auf der zweiten Seite der Zahl n. Die allgemeine
Gleichung in welcher die verschiedenen Wurzeln $\theta, \varphi, \omega, \omega', \ldots$ auf irgend eine Weise mit
einander combinirt werden können, enthält offenbar eine Anzahl K besonderer Gleichungen."
[28, pp. 336-337] We have replaced Dirichlet's equation number with our own.

In the case at hand, k is p, in which case $\varphi(k)$, the cardinality of $(\mathbb{Z}/k\mathbb{Z})^*$, is
equal to $p - 1$. To facilitate the comparison, switch the right-hand side with the
left-hand side of (2) (convention dictates that the term O(1) stays on the right
side), divide through by $\varphi(k)$, and note all of the following. First, Dirichlet used
$1 + \rho$ in place of s to denote the quantity that approaches 1 from above. Second,
the sum



has been absorbed into the term O(1) in equation (2); indeed, Dirichlet's next
move was to note that this sum is bounded by a constant. Finally, if $\chi_i(m)$
is equal to $\Omega^{i\gamma_m}$, then the complex conjugate $\overline{\chi_i(m)}$ is equal to $\Omega^{-i\gamma_m}$, so the
expression $\Omega^{-i\gamma_m} \log L_i$ would be expressed in our notation as $\overline{\chi_i(m)} \log L(s, \chi_i)$.

When Dirichlet considered the more general case, he arrived at the analogous 
result: 




Here the summation on the right hand side of the equation is over the possible 
values of a, b, c, c', . . . Once again, this translates to our equation (2).

Let us now summarize the differences between Dirichlet's presentation and 
contemporary ones. One salient difference is at the level of algebraic abstraction. 
The contemporary presentation developed the general notion of a character of 
an arbitrary finite abelian group, G, and showed that the characters themselves
form a group, $\widehat{G}$, isomorphic to G itself. Dirichlet, in contrast, focused on a
very particular group, $(\mathbb{Z}/k\mathbb{Z})^*$, the multiplicative group of units modulo k. His
presentation showed an intimate familiarity with the structure of that group, 
and the explicit mapping from that group to the group of characters. These 
details are inessential in the modern presentation. 

But there is another abstraction that distinguishes modern presentations 
from Dirichlet's, namely, the willingness to treat characters as mathematical 
objects in their own right. This is, in part, facilitated by the algebraic ab- 
straction: it is easier and more advantageous to treat characters as objects that 
transcend their representations when there are language and methods available 
that obviate the need to return to particular representations whenever there is 
real work to be done. But the dependence goes both ways: the abstract alge- 
braic methods cannot even be invoked until one is willing to consider characters 
as objects that can bear properties and algebraic structure. 

It is the absence of these two forms of abstraction that puts Dirichlet's pre- 
sentation in stark contrast with the modern one. Dirichlet did not define the 
notion of character at all, let alone general operations on characters; nor did he 
identify any of their general properties. Rather, characters only come into play 
as symbolic expressions in the construction of the L-functions, and their properties
are derived in an ad-hoc way as the proof proceeds. The characters are
thus "intensional" objects: they are represented explicitly in terms of primitive
roots of unity and generators of the corresponding residue group, and the only 
way of reasoning about them is in terms of this representing data. 
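
The gap between a character and its representing data can be illustrated with a small Python sketch. It is our illustration only, assuming a prime modulus p and hypothetical helper names: two different generators yield the same set of functions on the units, but the same exponent a names different functions under the two representations.

```python
import cmath

def characters_from_root(p, g):
    """Value tables (at n = 1, ..., p-1) of the p-1 characters built from generator g."""
    index = {pow(g, e, p): e for e in range(p - 1)}
    omega = cmath.exp(2j * cmath.pi / (p - 1))
    tables = []
    for a in range(p - 1):
        values = tuple(omega ** (a * index[n]) for n in range(1, p))
        tables.append(values)
    return tables

def as_extension(table, digits=9):
    """Round a value table so that equal functions compare equal despite float error."""
    return tuple((round(z.real, digits) + 0.0, round(z.imag, digits) + 0.0) for z in table)

# Both 3 and 5 generate the units modulo 7.
from_3 = {as_extension(t) for t in characters_from_root(7, 3)}
from_5 = {as_extension(t) for t in characters_from_root(7, 5)}
same_functions = (from_3 == from_5)
same_labelling = (as_extension(characters_from_root(7, 3)[1])
                  == as_extension(characters_from_root(7, 5)[1]))
```

In this sketch `same_functions` holds while `same_labelling` does not: the extensional objects agree, even though the intensional descriptions differ.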

In Dirichlet's presentation, there is no quantifying over characters. Instead, 
one quantifies over their representations in terms of primitive roots of unity. 
Similarly, there is no direct notion of summing over characters: Dirichlet was 
happy to sum over finite sets of natural numbers and tuples of natural numbers, 
but where we would sum over a finite set of characters, Dirichlet instead summed 
over representations in terms of such tuples. It is notable that he presented the 
Euler product formula in Q and ([5]) in terms of arbitrary primitive roots of 
unity, and then explained that these "represent" or "contain" more particular 
equations which can then be summed over to obtain the desired result. 

Finally, it is worth drawing attention to one further consequence of the 
different treatment of characters in the two presentations, namely, the extent 
to which they make important dependences explicit. Because Dirichlet's ex- 
pressions depend on so much representational data, Dirichlet often suppresses 
details, to keep the expressions from getting unwieldy. Thus, he wrote $\omega^{\gamma}$, suppressing
the dependence of $\gamma$ on n, where we would write $\chi(n)$ for the character
$\chi$ corresponding to $\omega$. This places a greater burden on the text of the proof,
and the reader's memory, to keep track of the relevant dependences, and, for
example, the ranges of a summation. Moreover, the modern notation $L(s, \chi)$
makes it easy to track the dependence on the character $\chi$, something that is lost
in Dirichlet's presentation. Thus in the modern expression $\overline{\chi(m)} \log L(s, \chi)$ the role
of $\chi$ is clear in both terms in the product; in Dirichlet's expression, $\Omega^{-i\gamma_m} \log L_i$,
the connection is buried in the definition of $L_i$.

7 Later presentations 

In this section, we will examine proofs and discussions of Dirichlet's theorem 
by Dedekind [29], de la Vallee-Poussin, Hadamard, Kronecker [66],
and Landau [68, 69]. We will focus on describing the similarities and differences
between the presentations of characters and L-functions, rather than describing 
each of the presentations in full, since most share the same general structure. 

7.1 Dedekind 

In 1863, Dedekind published a book, Vorlesungen über Zahlentheorie, based on
his notes taken from a course on number theory given by Dirichlet at Göttingen.
After the lectures, he added nine "supplements," or appendices, with material 
of his own. Supplement VI, in particular, contained a presentation of Dirichlet's 
theorem. Dedekind extended the supplements in the second edition, published 
in 1871, to include his theory of algebraic ideals. That theory was, in turn, 
revised and expanded in the third and fourth editions, which appeared in 1879 
and 1894, respectively. The earlier supplements, however, were not changed 
after the first edition.

Dedekind began his presentation with an overview of the main steps of the proof.
In particular, he proved the Euler product formula, and obtained the series
expansions for the L-functions and their logarithms. But the overview deals
with a more general class of L-functions than we have considered so far:

The general proof of . . . [Dirichlet's theorem] is based on the consid- 
eration of a class of infinite series of the form 

$$L = \sum \psi(n),$$

where n runs through all positive integers and the real or complex
function $\psi(n)$ satisfies the condition

$$\psi(n)\psi(n') = \psi(nn')$$

. . . [and] we always assume that $\psi(1) = 1$. [29, §132]

He then went on to focus his attention on Dirichlet characters. He used the
same expressions as Dirichlet to represent the characters, but whereas Dirichlet
expressed each root of unity $\omega$ involved in the construction as a power of
a single primitive root of unity, Dedekind did not bother with this step. He did
not explicitly refer to the relevant expressions as "characters," but he introduced
the notation $\chi(n)$ to denote their values, and pointed out that $\chi$ is completely
multiplicative:

The numerator [of $\psi(n)$] $\chi(n) = \theta^{\alpha}\eta^{\beta}\omega^{\gamma}\omega'^{\gamma'} \cdots$ has the characteristic
property $\chi(n)\chi(n') = \chi(nn')$ . . . [29, §133, footnote]

Nonetheless, like Dirichlet, Dedekind showed a reluctance to quantify over 
characters directly. He introduced the L-functions as 

specifying that the product is to be taken over all primes not dividing the com- 
mon difference, and the sum is to be taken over all natural numbers relatively 
prime to the common difference. Dedekind went on to note 

. . . that the series can exhibit quite different behavior, depending on
the roots of unity $\theta, \eta, \omega, \omega', \ldots$ appearing in the expression for $\psi(n)$.
Since these roots can have a, b, c, c', . . . values, respectively, the form
L contains altogether

$$abcc' \cdots = \varphi(k)$$

different particular series . . . [29, §133]

"Der allgemeine Beweis dieses Satzes . . . stützt sich auf die Betrachtung einer Classe von
unendlichen Reihen von der Form

$$L = \sum \psi(n),$$

wo der Buchstabe n alle ganzen positiven Zahlen durchlaufen muss, und die reelle oder complexe
Function $\psi(n)$ der Bedingung

$$\psi(n)\psi(n') = \psi(nn')$$

genügt ... so nehmen wir immer an, dass $\psi(1) = 1$ ist."

"Der Zähler $\chi(n) = \theta^{\alpha}\eta^{\beta}\omega^{\gamma}\omega'^{\gamma'} \cdots$ besitzt die charakteristischen Eigenschaften
$\chi(n)\chi(n') = \chi(nn')$ . . ."

Here, the use of the word "contains" is strongly reminiscent of Dirichlet's lan- 
guage. 

Dedekind then proceeded to divide the L-functions into classes. Given that 
the characters are defined in terms of a sequence of roots of unity, there are 
three distinct possibilities that can obtain for a given series L: 

1. The roots of unity in the construction of the character occurring in L are
all 1. There is only one such L-function, which is denoted $L_1$.

2. The roots of unity in the construction of the character occurring in L are
all real, i.e. are all ±1. L-functions that fall into this category are written
as $L_2$.

3. At least one of the roots of unity in the construction of the character
occurring in L is imaginary. L-functions that fall into this category are
written as $L_3$.

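Moreover, each L-function that falls into the third category has a conjugate: if
$L_3 = \sum \frac{\theta^{\alpha}\eta^{\beta}\omega^{\gamma}\omega'^{\gamma'}\cdots}{n^{s}}$, then there is a corresponding L-series in the same class,
denoted by $L_3'$, such that $L_3' = \sum \frac{\theta^{-\alpha}\eta^{-\beta}\omega^{-\gamma}\omega'^{-\gamma'}\cdots}{n^{s}}$.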
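
The three-way division above depends only on the tuple of roots of unity, so it can be sketched directly in Python. This is an illustration under our own assumptions: a root system is modelled as a tuple of complex numbers, the list `orders` (hypothetical name) gives the number of admissible values for each root, and the second class is taken to exclude the all-ones system.

```python
import cmath
from itertools import product

def root_systems(orders):
    """All tuples of roots of unity, one n-th root of unity for each n in orders."""
    choices = [[cmath.exp(2j * cmath.pi * e / n) for e in range(n)] for n in orders]
    return list(product(*choices))

def dedekind_class(roots, tol=1e-9):
    """1 if every root is 1, 2 if all roots are real (but not all 1), otherwise 3."""
    if all(abs(w - 1) < tol for w in roots):
        return 1
    if all(abs(w.imag) < tol for w in roots):
        return 2
    return 3

def conjugate_system(roots):
    """Root system of the conjugate series: each root replaced by its inverse."""
    return tuple(w.conjugate() for w in roots)

# Hypothetical example: modulus 15, whose unit group has a basis of orders 2 and 4.
systems = root_systems([2, 4])
class_sizes = [sum(1 for r in systems if dedekind_class(r) == c) for c in (1, 2, 3)]
```

For the example modulus the eight root systems split as one, three, and four across the classes, and conjugation maps the third class to itself.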

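Let us now see how Dedekind's notation plays out when it comes to summing
over the characters. When proving that the L-functions corresponding to a
complex character have a finite non-zero value as s tends to 1, he obtained
the following equation:

$$\varphi(k)\left(\sum \frac{1}{q^{s}} + \frac{1}{2}\sum \frac{1}{q^{2s}} + \cdots + \frac{1}{\mu}\sum \frac{1}{q^{\mu s}} + \cdots\right) = \log L_1 + \sum \log L_2 + \sum \log(L_3 L_3'), \quad (6)$$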

where, on the left hand side, the successive sums are over all the
primes q not dividing k which satisfy the successive conditions $q \equiv 1$
(mod k), $q^2 \equiv 1$ (mod k), etc. On the right hand side the first sum
is over all series $L_2$ of the second class, and the second sum is over
all conjugate pairs $L_3L_3'$ of series of the third class. [29, §136]

33 "Wir bemerken zunächst, dass diese Reihen je nach der Wahl der in dem Ausdrucke
vorkommenden Einheits-Wurzeln $\theta, \eta, \omega, \omega', \ldots$ ein ganz verschiedenes Verhalten zeigen; da
diese Wurzeln resp. a, b, c, c', . . . verschiedene Werthe haben können, so sind in der Form L
im Ganzen

$$abcc' \cdots = \varphi(k)$$

verschiedene besondere Reihen enthalten ..."

In a sense, Dedekind took the summations to range not over the characters 
or their representations, but over the L-functions themselves. Later, when 
dealing with these sums, he came closer to Dirichlet's presentation, in that the 
indices of the summation range over the sequences of roots that determine the 
characters. However, at times he introduced convenient abbreviations. For 
example, given a particular collection of roots of unity $\theta, \eta, \omega, \omega', \ldots$, Dedekind
denoted $\theta^{-\alpha_1}\eta^{-\beta_1}\omega^{-\gamma_1}\omega'^{-\gamma_1'} \cdots$ by $\chi$, where $\alpha_1, \beta_1, \gamma_1, \gamma_1'$ stand for the indices
of the first term of the progression, m. Note that while Dedekind used the
notation $\chi(n)$ in a footnote, he explicitly called $\chi$, as defined here, a value, and
indeed in this case it stands for the value $\overline{\chi(m)}$. Moreover, when he used the
symbol in a summation, Dedekind was explicit that the summation ranges not
over $\chi$, but rather the roots of unity involved in the definition:

The summation of all products $\chi \log L$ therefore gives the result

$$\varphi(k)\left(\sum \frac{1}{q^{s}} + \frac{1}{2}\sum \frac{1}{q^{2s}} + \frac{1}{3}\sum \frac{1}{q^{3s}} + \cdots\right) = \sum \chi \log L, \quad (7)$$

where the successive sums on the left hand side are over all primes q
satisfying the successive conditions $q \equiv m$, $q^2 \equiv m$, $q^3 \equiv m$ (mod k)
etc., while the sum on the right hand side is over all $\varphi(k)$ different
root systems $\theta, \eta, \omega, \omega', \ldots$ [29, §137]

In many ways, Dedekind did not stray far from Dirichlet's presentation. For 
the most part, his treatment of the characters was intensional, in the sense that 
the arguments rely on the particular representations of the characters. In other 
words, operations on the characters are described in terms of their represen- 
tations, rather than their values; and, like Dirichlet, he viewed summations as 
ranging over these representations in equations (6) and (7). Similarly, he classified
the characters depending on the roots involved in their representation,
rather than their values. Nor did he take the expression L, for the L-functions,
to depend on the characters themselves, but rather as expressions that behave
differently "depending on the roots of unity" appearing in their construction.
The L notation in particular does nothing to signal this dependence.

34 "$$\varphi(k)\left(\sum \frac{1}{q^{s}} + \frac{1}{2}\sum \frac{1}{q^{2s}} + \cdots + \frac{1}{\mu}\sum \frac{1}{q^{\mu s}} + \cdots\right) = \log L_1 + \sum \log L_2 + \sum \log(L_3 L_3'),$$

wo auf der linken Seite das erste, zweite Summenzeichen u.s.f. sich auf alle die in k nicht
aufgehenden Primzahlen q bezieht, welche resp. den Bedingungen $q \equiv 1$, $q^2 \equiv 1$ (mod k) u.s.f.
Genüge leisten; auf der rechten Seite bezieht sich das erste Summenzeichen auf alle Reihen $L_2$
der zweiten Classe, das zweite auf alle verschiedenen Paare $L_3L_3'$ conjugirter Reihen dritter
Classe." We have added the equation numbers in this quotation and the next, for later
reference.

"Die Summation aller Producte $\chi \log L$ giebt daher das Resultat

$$\varphi(k)\left(\sum \frac{1}{q^{s}} + \frac{1}{2}\sum \frac{1}{q^{2s}} + \frac{1}{3}\sum \frac{1}{q^{3s}} + \cdots\right) = \sum \chi \log L,$$

wo auf der linken Seite das erste, zweite, dritte Summenzeichen u.s.f. sich auf alle Primzahlen
q bezieht, welche resp. den Bedingungen $q \equiv m$, $q^2 \equiv m$, $q^3 \equiv m$ (mod k) u.s.f. genügen,
während das Summenzeichen auf der rechten Seite sich auf die sämmtlichen $\varphi(k)$ verschiedenen
Wurzel-Systeme $\theta, \eta, \omega, \omega'$, . . ."

Nonetheless, he did take some key steps towards viewing the characters 
abstractly. To start with, Dedekind went out of his way to isolate the characters 
as independent of the L-series in which they appear, and, in particular, flagged 
them as entities satisfying certain key properties. And, at least at times, he 
characterized summations as ranging over the L-functions themselves, hinting 
at a new level of abstraction. 

The use of the term "character" in the modern sense can be traced to the 
long Supplement XI in the 1879 edition of the Dirichlet-Dedekind Vorlesungen, 
which contains a presentation of Dedekind's theory of ideals in algebraic number 
fields. Some background will be helpful here. Number theory has long been 
concerned with questions as to which numbers can be represented by a given 
algebraic expression in which the variables are taken to range over the integers. 
It is easy to characterize the numbers that can be represented by a linear form 
ax + b in one variable, and quadratic reciprocity addresses the problem of which
numbers can be represented by a quadratic form $ax^2 + bx + c$. When it comes to
quadratic forms in two variables, things become more difficult. Fermat's famous 
theorem that the odd prime numbers that can be represented by the form $x^2 + y^2$
are exactly the ones that are congruent to one modulo four is considered a 
gem of number theory. Euler proved this and other of Fermat's claims in the 
eighteenth century, and Lagrange later extended the theory of binary quadratic 
forms considerably. A central contribution of Gauss' Disquisitiones Arithmeticae
is a complete classification of binary quadratic forms, whose study constitutes 
the bulk of that work. Gauss showed that it suffices to characterize "primitive" 
quadratic forms $ax^2 + 2bxy + cy^2$, where the second coefficient is even, and
a, b, and c have no factor in common. (Quadratic forms are also classified as 
"indefinite," "positive definite," and "negative definite," and in the discussion 
that follows it should be assumed that we are fixing our attention on one of these 
fixed kinds.) He called the value $D = b^2 - ac$ the discriminant of the form, and
showed how to assign to each primitive quadratic form of discriminant D a finite 
list of values of the form ±1 which, roughly speaking, characterizes its behavior. 
He called these values the characters of the form, and took the "genus" of a 
form to be the collection of all the primitive forms with the same discriminant 
and character. 

Already in the first edition of the Vorlesungen, Dedekind used Gauss' terminology
in Supplement IV, "Genera of quadratic forms." In 1879, however,
Dedekind went further by showing that Gauss' classification could be understood
in terms of his theory of ideals. Specifically, he showed that there is a corre- 
spondence between genera of quadratic forms and equivalence classes of ideals 
in an associated quadratic extension of the rationals. Moreover, there is a group 
structure on the ideals, and Gauss' characters correspond to characters on that 
group, in the modern sense. This explains the quotation in Section 3: Dedekind
had simply adopted Gauss' terminology to characterize the homomorphisms
from groups of ideals to the (nonzero) complex numbers more generally.

As we noted in Section 3, by 1882, Weber [100] had already begun using the
term "character" with respect to arbitrary finite abelian groups. He defined a
(finite) abelian group as consisting of h elements "of any type" (irgend welcher
Art) satisfying the usual group laws. He then stated the now-familiar structure
theorem: 

In a finite abelian group of order h one can always choose elements
$\Theta_1, \Theta_2, \ldots, \Theta_\nu$ of order $n_1, n_2, \ldots, n_\nu$, so that each element of G
can be expressed uniquely by the form

$$\Theta_1^{s_1}\Theta_2^{s_2}\cdots\Theta_\nu^{s_\nu},$$

where $s_1, s_2, \ldots, s_\nu$ are chosen from complete residue systems with
respect to the moduli $n_1, n_2, \ldots, n_\nu$. [100, pp. 306-307]

Weber called the sequence of values $\Theta_1, \Theta_2, \ldots, \Theta_\nu$ a basis of the group, and
used them to define the characters as follows: 

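If we assign, to the $\nu$ elements $\Theta_1, \Theta_2, \ldots, \Theta_\nu$ of such a basis, $\nu$
roots of unity $\omega_1, \omega_2, \ldots, \omega_\nu$ of order $n_1, n_2, \ldots, n_\nu$, respectively, then
each element $\Theta = \Theta_1^{s_1}\Theta_2^{s_2}\cdots\Theta_\nu^{s_\nu}$ of the group also corresponds to a
particular $h$th root of unity

$$\omega = \omega_1^{s_1}\omega_2^{s_2}\cdots\omega_\nu^{s_\nu}.$$

We denote this root of unity by $\chi(\Theta)$ and call it the character of the
element $\Theta$. [100, p. 307]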

Weber went on to point out that this gives rise to h distinct characters, and 
that each such character x satisfies the equation 

$$\chi(\Theta)\chi(\Theta') = \chi(\Theta\Theta').$$

Moreover, this last condition provides an exact characterization:

If, conversely, $\chi(\Theta)$ is a uniquely determined function of $\Theta$ which
satisfies [the equation above], then it is necessarily contained among
these h characters. [ibid.]

"In einer Abel'schen Gruppe G von Grade h kann man stets die Elemente $\Theta_1, \Theta_2, \ldots, \Theta_\nu$
von den Graden $n_1, n_2, \ldots, n_\nu$ so auswählen, dass in der Form

$$\Theta_1^{s_1}\Theta_2^{s_2}\cdots\Theta_\nu^{s_\nu}$$

jedes Element von G und jedes nur einmal enthalten ist, wenn $s_1, s_2, \ldots, s_\nu$ je einem
vollständigen Restsystem nach den Moduln $n_1, n_2, \ldots, n_\nu$ entnommen werden."

"Ordnet man den $\nu$ Elementen $\Theta_1, \Theta_2, \ldots, \Theta_\nu$ einer solchen Basis $\nu$ Einheitswurzeln
$\omega_1, \omega_2, \ldots, \omega_\nu$ von den Graden $n_1, n_2, \ldots, n_\nu$, so entspricht auch jedem Element $\Theta =
\Theta_1^{s_1}\Theta_2^{s_2}\cdots\Theta_\nu^{s_\nu}$ der Gruppe eine bestimmte $h$te Einheitswurzel $\omega$ nach der Vorschrift

$$\omega = \omega_1^{s_1}\omega_2^{s_2}\cdots\omega_\nu^{s_\nu}.$$

Wir bezeichnen diese Einheitswurzel mit $\chi(\Theta)$, und nennen dieselbe den Charakter des Elementes
$\Theta$."

Thus Weber's paper provides us not only with the modern notion of a character 
defined on an arbitrary finite abelian group, but also with the modern under- 
standing that such characters constitute an instance of the function concept. 
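
Weber's construction translates almost line for line into code. The following Python sketch is ours and rests on stated assumptions: a group element is modelled by its exponent tuple $(s_1, \ldots, s_\nu)$ relative to a basis with orders $n_1, \ldots, n_\nu$, and a choice of roots of unity is encoded by a tuple t with $\omega_i = e^{2\pi i t_i/n_i}$. All function names are hypothetical.

```python
import cmath
from itertools import product

def elements(orders):
    """Exponent tuples (s_1, ..., s_v), one for each element of the group."""
    return list(product(*[range(n) for n in orders]))

def multiply(x, y, orders):
    """Group law: add exponents componentwise, modulo the orders of the basis elements."""
    return tuple((a + b) % n for a, b, n in zip(x, y, orders))

def character(orders, t):
    """Character attached to the roots of unity omega_i = exp(2*pi*i*t_i/n_i)."""
    def chi(s):
        # chi(Theta) = omega_1**s_1 * ... * omega_v**s_v, computed via the total angle
        angle = sum(ti * si / n for ti, si, n in zip(t, s, orders))
        return cmath.exp(2j * cmath.pi * angle)
    return chi

# Hypothetical example: a basis with orders 2 and 4, so h = 8.
orders = (2, 4)
group = elements(orders)
all_characters = [character(orders, t) for t in group]
```

Once `character` returns an ordinary function, the multiplicative law can be checked on values alone, without looking back at the tuple t that produced it; that shift is what Weber's converse statement licenses.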

There are some small differences between Weber's presentation and ours. In 
his presentation of the second orthogonality relation, Weber used ellipses rather 
than summation notation to sum over the characters: 

For each element $\Theta$:

$$\chi_1(\Theta) + \chi_2(\Theta) + \cdots + \chi_h(\Theta) = 0,$$

except for the identity element $\Theta_0$, for which we have

$$\chi_1(\Theta_0) + \chi_2(\Theta_0) + \cdots + \chi_h(\Theta_0) = h.$$

[100, p. 308]

This implicitly assumes that the h characters are enumerated $\chi_1, \ldots, \chi_h$, though
the enumeration is arbitrary. Moreover, Weber provided the explicit construc- 
tion of the set of characters before the extensional characterization, whereas 
modern presentations usually provide the extensional characterization first. But 
these points are minor, and, otherwise, his presentation is little different from 
the one in Section [i^O 
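
The relation Weber states with ellipses can be checked numerically for a small group. The Python sketch below is our illustration: it assumes a group given by the orders of a basis, represents an element by its exponent tuple s, and enumerates the h characters by tuples t (an arbitrary enumeration, as in Weber's $\chi_1, \ldots, \chi_h$). The function name is hypothetical.

```python
import cmath
from itertools import product

def character_sum(orders, s):
    """Sum of chi(s) over all h characters of the group with the given basis orders."""
    total = 0
    for t in product(*[range(n) for n in orders]):
        # Each t encodes one choice of roots of unity, hence one character.
        angle = sum(ti * si / n for ti, si, n in zip(t, s, orders))
        total += cmath.exp(2j * cmath.pi * angle)
    return total
```

For orders (2, 4), for instance, `character_sum((2, 4), (0, 0))` is 8 up to rounding, and the sum at every other element vanishes up to floating-point error.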

Although Dedekind revised his theory of ideals substantially in the third and
fourth editions of the Vorlesungen, he did not revise any of the supplements that 
appeared in the first edition. In particular, he did not take advantage of the 
opportunity to go back to introduce the modern notion of a character in his 
presentation of Dirichlet's proof, although he clearly could have done so in the 
later editions. 



7.2 de la Vallee-Poussin 

More than two decades later, Charles Jean de la Vallee-Poussin gave another 
presentation of Dirichlet's theorem in an 1895/6 paper entitled "Démonstration



"Ist umgekehrt $\chi(\Theta)$ eine durch das Element $\Theta$ eindeutig bestimmte Function, welche der
Bedingung . . . genügt, so ist dieselbe nothwendig unter diesen h Charakteren enthalten."

39 "Für jedes Element $\Theta$ ist:

$$\chi_1(\Theta) + \chi_2(\Theta) + \cdots + \chi_h(\Theta) = 0,$$

ausgenommen für das Hauptelement $\Theta_0$, für welches

$$\chi_1(\Theta_0) + \chi_2(\Theta_0) + \cdots + \chi_h(\Theta_0) = h.$$"

Mackey's historical survey [74] of the history of harmonic analysis includes a very helpful
and informative overview of the history of character-theoretic ideas in number theory.

simplifiée du Théorème de Dirichlet sur la progression arithmétique," and again
in sections from his 1897 book Recherches analytiques sur la théorie des nombres
premiers. He introduced the characters in much the same way that Dirichlet
and Dedekind did, namely, via an explicit construction in terms of primitive
roots and roots of unity. Like Dedekind and Weber, he used the symbol $\chi$ for
the characters. Like Weber, he distinguished between the $\varphi(M)$ characters
modulo M using subscripts, writing $\chi_1, \chi_2, \ldots, \chi_{\varphi(M)}$.

de la Vallee-Poussin went on to subject the characters to a thorough study. 
Whereas Dirichlet and Dedekind divided the L-functions (or, perhaps more 
precisely, the corresponding expressions) into three categories based on their 
representations, de la Vallee Poussin based the categorization on the characters 
themselves. The following description applies to the simplest case, where the 
modulus M is a prime: 

One calls the character that corresponds to the root +1 the principal
character; it is equal to unity for all numbers n. Apart from the
principal character, there is only one which is real for all numbers
n: it corresponds to the root (-1) and is equal to ±1 depending on
the number n.

We call all the other characters imaginary characters, though
they may have a real value for some particular numbers. Their
modulus is always equal to unity. [HI p. 19]

Thus de la Vallee Poussin divided the characters into three classes: 

1. the class consisting solely of the principal character; 

2. the class consisting of all real characters (in this case, there is only one 
character which is real for all numbers, corresponding to the root — 1); 
and 

3. the class consisting of all other characters, called the imaginary characters.

Notice that his categorization includes both intensional and extensional characterizations;
that is, he characterized the characters in each class both in terms
of the values they take, and the roots involved in their construction. 

"On appelle caractère principal celui qui correspond à la racine +1; il est égal à l'unité
pour tous les nombres n. En dehors du caractère principal, il n'y en a qu'un seul qui soit réel
pour tous les nombres n: il correspond à la racine (-1) et est égal à ±1 suivant le nombre n.

Nous donnerons à tous les autres caractères le nom de caractères imaginaires, quoiqu'ils
puissent avoir une valeur réelle pour certains nombres particuliers. Leur module est toujours
égal à l'unité."

In his study of the characters in his 1896 paper, de la Vallee Poussin listed a
number of "very important relations" (relations très importantes). The first is
that $\chi(n)\chi(n') = \chi(nn')$ for every character $\chi$, and n and n'. The second is
that for a given character $\chi$ modulo M, and any n and n', if $n \equiv n'$ (mod M)
then $\chi(n) = \chi(n')$. The third and fourth are the first and second orthogonality
relations, respectively. In order to present the second orthogonality relation, de
la Vallee Poussin introduced a new symbol, S, to denote summation over the
characters:

Consider . . . the sum extending over all the characters, that is to say 
over all the systems of roots 

SxX{n) = S^uj'^'uj^^ . . . 

. . . For every number n, the sum extending over the totality of characters
satisfies

$$S_\chi \chi(n) = 0,$$

the only exception being the case where

$$n \equiv 1 \pmod{M},$$

because then all the indices are zero and one has

$$S_\chi \chi(n) = \varphi(M).$$

[M pp. 14-15]

Although his notation is different from that of Dirichlet and Dedekind, his 
proof, like theirs, relies on an explicit calculation based on the construction of 
the characters, in contrast to the modern proof sketched in Section IT2l 

When introducing the L-functions in his 1896 paper, de la Vallee-Poussin
simply described them in terms of their equivalent expressions as a sum and
product. In 1897, however, he adopted a functional notation:

We define the function $Z(s, \chi \bmod M)$, for $\Re(s) > 1$, by the absolutely
convergent expressions:

$$Z(s,\chi) = \sum_{n} \frac{\chi(n)}{n^{s}} = \prod_{q}\left(1 - \frac{\chi(q)}{q^{s}}\right)^{-1},$$

where n designates successively all the integers prime to M and q
all the prime numbers not dividing M. [19, p. 56]

"Considérons ... la somme étendue à tous les caractères, c'est-à-dire à tous les systèmes
de racines

SxX{n) = Sc^uj'^^ui^^ . . .
. . . Pour tout nombre n, la somme étendue à la totalité des caractères

$$S_\chi \chi(n) = 0,$$

à la seule exception près du cas où

$$n \equiv 1 \pmod{M},$$

car alors tous les indicateurs sont nuls et l'on a

$$S_\chi \chi(n) = \varphi(M).$$"

Aside from the choice of the letter Z instead of the letter L, we have finally
arrived at the modern notation. But, like Dirichlet and Dedekind, he was reluctant
to quantify over the characters, using language that is eerily reminiscent of
theirs. For example, in his 1897 work, he wrote:

. . . one finds the fundamental equation 

$$(E) \ldots\ -\lim (s-1)\frac{Z'(s,\chi)}{Z(s,\chi)} = \lim (s-1)\sum \chi(q)\frac{\log q}{q^{s}},$$

and this equation (E) represents in reality $\varphi(M)$ distinct ones, which
result from exchanging the characters amongst themselves. [IHl
p. 65].

Nonetheless, there are a number of important respects in which de la Vallee-Poussin's
presentation is close to the modern one. To start with, characters are treated as 
an object of study in their own right, bearing their own properties and relations. 
Moreover, they are classified extensionally, although this classification is, at 
the same time, related to properties of their representations. They are also 
represented notationally as arguments to the L-functions, hinting that they are, 
in a sense, on par with natural numbers. Finally, and most interestingly, de 
la Vallee-Poussin introduced a summation notation with an index ranging over 
the characters, not their representing data. The fact that he went out of his 
way to use one symbol, S, for summation of characters, and the usual $\Sigma$ symbol
for summation over sets of natural numbers hints that he did not conceive of
these as precisely the same mathematical operation, although they have similar 
properties. 

7.3 Hadamard 

For each real number x, let $\pi(x)$ denote the number of primes less than x.
Around the turn of the nineteenth century, both Legendre and Gauss conjectured that $\pi(x)$
is asymptotic to $x/\ln(x)$, in the sense that their ratio, $\pi(x) \ln x / x$, approaches 1

"Nous définirons la fonction $Z(s, \chi \bmod M)$, pour $\Re(s) > 1$, par les expressions absolument
convergentes

$$Z(s,\chi) = \sum_{n} \frac{\chi(n)}{n^{s}} = \prod_{q}\left(1 - \frac{\chi(q)}{q^{s}}\right)^{-1},$$

où n désigne successivement tous les nombres entiers premiers à M et q tous les nombres
premiers qui ne divisent pas M."

". . . on trouve l'équation fondamentale

$$(E) \ldots\ -\lim (s-1)\frac{Z'(s,\chi)}{Z(s,\chi)} = \lim (s-1)\sum \chi(q)\frac{\log q}{q^{s}},$$

et cette équation (E) en représente en réalité $\varphi(M)$ distinctes par l'échange des caractères
entre eux."

as x approaches infinity. This fact, now known as the "prime number theorem,"
was finally proved by both de la Vallee-Poussin and Jacques Hadamard, working
independently, in 1896. de la Vallee-Poussin's proof was first published in the 
Annales de la Societe Scientifique de Bruxelles, but also appeared in the 1897 
book we discussed in the last section. Both Hadamard and de la Vallee-Poussin 
also obtained a generalization of the prime number theorem to arithmetic pro- 
gressions, which says that if m is relatively prime to k, the number of primes less 
than x that are congruent to m modulo k is asymptotic to $(1/\varphi(k)) \cdot (x/\ln x)$.
This is, of course, a generalization of Dirichlet's theorem, since it implies that 
there are infinitely many primes congruent to m modulo k. 
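
The equidistribution asserted by this generalization is easy to observe numerically. The following Python sketch is only an illustration with hypothetical function names: it counts the primes below x in each residue class m modulo k with gcd(m, k) = 1, so that the counts can be compared with one another and with $x/(\varphi(k)\ln x)$. Agreement at small x is rough, since the theorem is asymptotic.

```python
from math import gcd, log

def primes_below(x):
    """All primes less than x, by the sieve of Eratosthenes."""
    sieve = [True] * max(x, 2)
    sieve[0] = sieve[1] = False
    for i in range(2, int(x ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, x, i):
                sieve[j] = False
    return [n for n in range(x) if sieve[n]]

def counts_by_residue(x, k):
    """Number of primes below x in each residue class m mod k with gcd(m, k) = 1."""
    counts = {m: 0 for m in range(k) if gcd(m, k) == 1}
    for q in primes_below(x):
        if q % k in counts:
            counts[q % k] += 1
    return counts

def euler_phi(k):
    """Number of residues m in 1..k that are relatively prime to k."""
    return sum(1 for m in range(1, k + 1) if gcd(m, k) == 1)

def predicted_count(x, k):
    """The asymptotic main term x / (phi(k) * ln x)."""
    return x / (euler_phi(k) * log(x))
```

For example, `counts_by_residue(100, 4)` separates the odd primes below 100 into the classes 1 and 3 modulo 4, and `predicted_count(100, 4)` gives the corresponding main term for comparison.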

Hadamard's 1896 paper was titled "Sur la distribution des zéros de la fonction
$\zeta(s)$ et ses conséquences arithmétiques," and provided a number of
results concerning L-functions and characters. He introduced characters in the
same way as Dirichlet, Dedekind and de la Vallee-Poussin, namely, by a construction
in terms of roots of unity and primitive elements. Like Weber and de
la Vallee-Poussin, he distinguished between the characters modulo k notationally
by making use of subscripts, writing $\psi_v(n)$ where v runs from 1 to $\varphi(k)$.
He defined the L-functions as follows:

$$L_v(s) = \sum_{n=1}^{\infty} \frac{\psi_v(n)}{n^{s}}.$$

The notation $L_v(s)$, in contrast to de la Vallee-Poussin's $Z(s, \chi)$, does not
indicate an explicit dependence on the character, though the subscript pro- 
vides a link. As did his predecessors, Hadamard classified the L-functions into 
three different classes, but his classification was intensional, like Dirichlet's and 
Dedekind's, referring to the roots of unity used in the construction of the char- 
acters. 

When it came to summing over the characters, Hadamard, unlike Dirichlet, 
Dedekind, or de la Vallee-Poussin, let the index of the summation range over 
the subscripts described above. 

The fundamental equation that Dirichlet used in the demonstration
of his theorem is

$$\sum_{v} \frac{\log L_v(s)}{\psi_v(m)} = \varphi(k)\left(\sum \frac{1}{q^{s}} + \frac{1}{2}\sum{}' \frac{1}{q^{2s}} + \frac{1}{3}\sum{}'' \frac{1}{q^{3s}} + \cdots\right),$$

where m is some integer prime to k, and where, among the signs
$\sum, \sum', \sum'', \ldots$, the first ranges over the prime numbers q such that
$q \equiv m$ (mod k), the second ranges over the prime numbers q such
that $q^2 \equiv m$ (mod k), etc. [gl p. 209]

"L'équation fondamentale utilisée par Dirichlet pour la démonstration de son théorème,
est

$$\sum_{v} \frac{\log L_v(s)}{\psi_v(m)} = \varphi(k)\left(\sum \frac{1}{q^{s}} + \frac{1}{2}\sum{}' \frac{1}{q^{2s}} + \frac{1}{3}\sum{}'' \frac{1}{q^{3s}} + \cdots\right),$$

où m est un entier quelconque premier avec k et où les signes $\sum, \sum', \sum'', \ldots$ s'étendent, le
premier aux nombres premiers q tels que $q \equiv m$ (mod k), le second aux nombres premiers q
tels que $q^2 \equiv m$ (mod k), etc."

Once again, this is comparable to the modern formulation in equation (2).

Hadamard's presentation falls somewhere between those of Dirichlet and 
Dedekind, on the one hand, and that of de la Vallee-Poussin, on the other. 
His treatment of characters was, for the most part, intensional. But, in con- 
trast to Dirichlet and Dedekind, he left representational data out of the key 
summations, thereby eliminating unnecessary clutter. However, unlike de la 
Vallee-Poussin, he did not go so far as to characterize these as summations over
the characters themselves. Rather, he introduced natural number indices, v, 
labeling the characters, and then took the variable of summation to range over 
those. 

7.4 Kronecker 

Between 1863 and 1891, Leopold Kronecker lectured at the University of Berlin 
on a range of subjects, including number theory, algebra, and the theory of 
determinants. After Kronecker's death, his student, Kurt Hensel, edited the 
five volumes of his collected works, which were published between 1895 and 
1930. Hensel also took it upon himself to work Kronecker's copious lecture 
notes and course material into two textbooks, Vorlesungen über Zahlentheorie 
and Vorlesungen über die Theorie der Determinanten, which he published in 
1901 and 1903, respectively. The first of these closes with a proof of Dirichlet's 
theorem, which we will discuss here. 

Kronecker, in fact, wrote his doctoral dissertation on algebraic number the- 
ory under Dirichlet's supervision, completing it in 1845. Kronecker insisted that 
mathematics should maintain a clear focus on symbolic representations and al- 
gorithms, a commitment that is evident throughout his work. Avoiding talk 
of "arbitrary" functions, real numbers, and so on, Kronecker focused instead 
on the construction of algebraic systems and explicit algorithms for calculating 
with these algebraic representations. For example, his article "Grundzüge einer 
arithmetischen Theorie der algebraischen Grössen" [53] provides means of car- 
rying out operations on systems of algebraic integers in finite extensions of the 
rationals. Similarly, "Ein Fundamentalsatz der allgemeinen Arithmetik" [65] 
provides an explicit construction of a splitting field for any polynomial with 
integer coefficients, and he viewed this as filling a gap in Galois' work.^46 

Kronecker's approach to number theory had a similar orientation. As Hensel 
put it in the introduction to Vorlesungen über Zahlentheorie: 

He believed that one can and must in this domain formulate each 
definition in such a way that its applicability to a given quantity 
can be assessed by means of a finite number of tests. Likewise, an 
existence proof for a quantity is to be regarded as entirely rigorous 
only if it contains a method by which that quantity can actually be 
found. Kronecker was far from one to completely reject a defini- 
tion or proof that does not meet these highest requirements, but he 
believed that there was then something missing, and he held that 
completing it along these lines is an important task, by which our 
knowledge is furthered in an important sense.^47 [65, p. vi] 

^46 See Edwards |3H434| for a discussion of these works, and Kronecker's mathematics more 
generally. 

Kronecker's proof of Dirichlet's theorem is a focal point of the book, in 
that it brings together methods and ideas developed throughout the entire work. 
Here is Hensel's characterization: 

[The book] closes with the proof of the famous theorem that any 
arithmetic sequence, whose first term and common difference are 
relatively prime, contains infinitely many prime numbers. But Kro- 
necker completed Dirichlet's proof of this theorem in a significant 
sense, in that he proved that one can determine, for an arbitrarily 
large number $\mu$, a larger number $\bar{\mu}$, so that in the interval $(\mu \ldots \bar{\mu})$ 
one is sure to find a prime number of the required form. This nice 
supplement to that famous proof is a fruit of the higher demands, 
mentioned above, that Kronecker placed on arithmetic proofs. And 
here it seems, in fact, that with this improvement of Dirichlet's 
theorem, nothing by way of simplicity or transparency has been 
lost.^48 [65, p. viii] 

The interaction between analytic and number-theoretic methods is a central 
theme of the Vorlesungen, which opens with a discussion of Gauss' distinction 
between the fields of number theory, algebra, and analysis, and argues that 
they cannot be cleanly separated. For example, Kronecker praised Leibniz' 
characterization of π in terms of the series 

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}$$

as providing a definition of π of a "fully number-theoretic character" [durchaus 
zahlentheoretischem Charakter]. 

^47 "Er meinte, man könne und man müsse in diesem Gebiete eine jede Definition so fassen, 
daß durch eine endliche Anzahl von Versuchen geprüft werden kann, ob sie auf eine vorgelegte 
Größe anwendbar ist oder nicht. Ebenso wäre ein Existenzbeweis für eine Größe erst dann 
als völlig streng anzusehen, wenn er zugleich eine Methode enthalte, durch welche die Größe, 
deren Existenz bewiesen werde, auch wirklich gefunden werden kann. Kronecker war weit 
davon entfernt, eine Definition oder einen Beweis vollständig zu verwerfen, der jenen höchsten 
Anforderungen nicht entsprach, aber er glaubte, daß dann eben noch etwas fehle, und er 
hielt eine Ergänzung nach dieser Richtung hin für eine wichtige Aufgabe, durch die unsere 
Erkenntnis in einem wesentlichen Punkte erweitert würde." We have modified and extended 
a translation due to Stein [94, p. 250]. 

^48 "... schließt mit dem Beweise des berühmten Satzes, daß jede arithmetische Reihe, deren 
Anfangsglied und Differenz teilerfremd sind, unendlich viele Primzahlen enthält; aber Kro- 
necker vervollständigt den Dirichletschen Beweis dieses Satzes in einem wesentlichen Punkte, 
indem er nachweist, daß man für jede beliebige groß anzunehmende Zahl $\mu$ eine größere Zahl 
$\bar{\mu}$ so bestimmen kann, daß in dem Intervalle $(\mu \ldots \bar{\mu})$ sich sicher eine Primzahl der verlangten 
Form befindet. Diese schöne Ergänzung jenes berühmten Beweises ist eine Frucht der oben 
erwähnten höheren Forderungen, welche Kronecker an arithmetische Beweise stellte, und hier 
scheint es in der That, daß durch diese Verbesserung der Dirichletsche Beweis nichts an Ein- 
fachheit und Durchsichtigkeit verloren hat." 

What these examples teach us holds, more generally, for all the defi- 
nitions of analysis. These always lead back to the integers and their 
properties, and from that entire branch of mathematics, only the 
concept of a limit has so far remained foreign. Arithmetic cannot 
be separated from analysis, which has freed itself from its origi- 
nal source, geometry, and has developed independently, on free soil. 
Even less so, as Dirichlet has succeeded in obtaining the deepest 
and most beautiful results in arithmetic by combining the methods 
of the two disciplines.^49 [65, pp. 4-5] 

This interplay comes to the fore in the proof of Dirichlet's theorem. The 
text reports that Kronecker's version of the proof, in the case where the com- 
mon difference is prime, was worked out in the lectures he gave in the winter 
semester of 1875/1876, whereas the general case was presented during the 
winter semester of 1886/1887 [65, p. 442]. Fixing a modulus m and an r rela- 
tively prime to m, recall that Kronecker aimed to provide, for a given number 
$\mu$, an explicit upper bound $\bar{\mu}$ on the integers one has to consider to find a prime 
number greater than $\mu$ and congruent to r modulo m. When it came to the an- 
alytic part of the proof, Kronecker explained that obtaining the desired bound 
is reduced to obtaining a positive lower bound on a certain analytic series that 
arises in the proof. He wrote: 

For the ambiguous [real] characters, Dirichlet's proof meets this re- 
quirement. But his methods are not sufficient to do the same for the 
series corresponding to the complex characters.^50 [65, p. 481] 

He went on: 

Generally speaking, this is a special case of the problem of finding, 
given a well-defined nonzero number, a bound, above which it nec- 
essarily lies. This is not as easy as it seems at first glance; indeed, 
in some circumstances, the problem can count among the thorniest 
questions known to science.^51 [65, pp. 481-482] 

^49 "Was uns diese Beispiele lehren, ist nun maßgebend für alle Definitionen der Analysis 
überhaupt. Dieselben führen stets auf die ganzen Zahlen und ihre Eigenschaften zurück, und 
es ist von dem ganzen Gebiete des letztgenannten Zweiges der Mathematik der einzige Begriff 
des limes oder der Grenze der Zahlentheorie bisher fremd geblieben. Gegen die Analysis 
also, die sich von ihrer ursprünglichen Quelle, der Geometrie, befreit und auf freiem Boden 
selbständig entwickelt hat, kann die Arithmetik nicht abgegrenzt werden, um so weniger, als es 
Dirichlet gelungen ist, grade die schönsten und tiefliegenden arithmetischen Resultate durch 
die Verbindung der Methoden beider Disciplinen zu erzielen." 

^50 "Für die ambigen Charaktere erfüllt der Beweis von Dirichlet auch diese Forderung, dage- 
gen reichen seine Methoden nicht aus, um dasselbe auch für die Reihen zu leisten, welche den 
complexen Charakteren entsprechen." 

^51 "Überhaupt ist das hier in einem speziellen Falle sich darbietende Problem, für eine von 
Null verschiedene wohldefinierte Zahlgröße eine Grenze zu finden, über der sie notwendig liegen 
muß, nicht so einfach, als es auf den ersten Blick erscheint, vielmehr kann diese Aufgabe unter 
Umständen eine der heikelsten Fragen sein, die die Wissenschaft kennt." 

Kronecker took the opportunity to clarify the methodological stance towards 
analysis that is appropriate to these issues. Even though, in his work, he avoided 
the notion of an "arbitrary" real number, he recognized the importance of un- 
derstanding particular number systems in analytic terms. For example, given 
a symbolic representation of a real number, one may wish to compute rational 
approximations. But knowing that a nonnegative real number is nonzero is not 
the same as having a positive, rational lower bound on its value.^52 Kronecker 
mentioned the problem of bounding a nonzero determinant away from zero as 
an example of a problem that is generally difficult. He also noted that, given two 
convergent series, it can be difficult to determine which one has a greater value. 
These facts are now quite familiar in constructive and computable analysis.^53 
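
The gap between knowing that a number is nonzero and having a bound in hand can be made concrete with a small sketch. We assume, purely for illustration, that a real number is presented by a function `approx(n)` returning a rational within 2^-n of it; the function names, the safety cap `max_steps`, and the sample number sqrt(2) - 1.414 are our own choices and are not taken from Kronecker. The search below certifies a positive rational lower bound as soon as an approximation is accurate enough, and it cannot succeed when the number is zero.

```python
from fractions import Fraction


def positive_lower_bound(approx, max_steps=64):
    """Search for a rational q > 0 with q <= |x|.

    `approx(n)` is assumed to return a Fraction within 2**-n of the real x.
    Without the cap, the loop would halt exactly when x is nonzero;
    `max_steps` is a safety limit added for this sketch.
    """
    for n in range(max_steps):
        a = approx(n)
        err = Fraction(1, 2 ** n)
        if abs(a) > err:
            # |x| >= |a| - err > 0, so this is a certified lower bound.
            return abs(a) - err
    return None  # nothing certified: x may be zero, or merely very small


def sqrt2_minus_1414(n):
    """Rational approximation to sqrt(2) - 1.414, within 2**-n."""
    # Bisection on [1, 2]: after n + 1 halvings the bracket has width 2**-(n + 1).
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(n + 1):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo - Fraction(1414, 1000)


bound = positive_lower_bound(sqrt2_minus_1414)
```

Note that the number of steps needed is not known in advance: it depends on how close the number is to zero, which is precisely the information an explicit bound is supposed to supply.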

Kronecker's treatment of characters provides an interesting combination of 
Dirichlet's approach and modern ones. Fixing a modulus m, Kronecker, like 
Dirichlet, provided a fully explicit description of the characters modulo m in 
terms of primitive elements of the powers of primes giving m and primitive 
roots of unity. However, with judicious choice of notation, he was at the same 
time able to suppress extraneous detail. For example, fixing primitive elements 
$\gamma, \gamma_0, \gamma_1, \ldots, \gamma_g$, one can express any element r relatively prime to m in the form 

$$r \equiv \gamma^{\rho} \gamma_0^{\rho_0} \cdots \gamma_g^{\rho_g} \pmod{m}.$$

But then one can "package" the representing values $(\rho, \rho_0, \rho_1, \ldots, \rho_g)$ as a tuple, 
the "index system of r," denoted Indd r. For each of the cyclic groups in 
the decomposition of the group of units modulo m, suppose we also choose 
corresponding roots of unity, $\omega, \omega_0, \omega_1, \ldots, \omega_g$.^54 

Now let r be a unit modulo m, and Indd r = $(\rho, \rho_0, \rho_1, \ldots, \rho_g)$; we 
now assign to r the root of unity: 

$$\Omega(r) = \omega^{\rho} \omega_0^{\rho_0} \omega_1^{\rho_1} \cdots \omega_g^{\rho_g},$$

which we call a character of r, since the index system $(\rho, \rho_0, \rho_1, \ldots, \rho_g)$, 
and hence $\Omega(r)$, are uniquely determined.^55 [65, p. 444] 

We obtain all the possible characters by fixing primitive roots of unity $\omega, \omega_0, \ldots, \omega_g$, 
so that every tuple of roots can be represented in the form 

$$\omega^{k}, \omega_0^{k_0}, \omega_1^{k_1}, \ldots, \omega_g^{k_g},$$

where the k and $k_i$'s are less than the cardinality of the corresponding cyclic 
group. 

^52 In modern terms, one can obtain such a lower bound by computing rational approxima- 
tions until one obtains one that is sufficiently accurate to bound the number away from zero. 
Thus the statement "if r ≠ 0, then |r| > 0" was accepted by the Russian school of constructive 
mathematics in the 1950's and 1960's. This implication is equivalent to "Markov's principle," 
which is, however, rejected by strict constructivists. See, for example, [96]. 

^53 See, for example, Troelstra and van Dalen [96]. 

^54 As above, the powers of two require special treatment. In the discussion, Kronecker 
assumes that m is divisible by 8, in which case the units modulo that power of two form a 
product of two cyclic groups; the values of $\rho$ and $\rho_0$ are the indices in those two groups, and 
$\omega$ and $\omega_0$ are the corresponding roots of unity. 

^55 "Es sei nun r eine Einheit modulo m, und Indd r = $(\rho, \rho_0, \rho_1, \ldots, \rho_g)$; ordnen wir r jetzt 
die Einheitswurzel: 

$$\Omega(r) = \omega^{\rho} \omega_0^{\rho_0} \omega_1^{\rho_1} \cdots \omega_g^{\rho_g}$$

zu, so gehört zu jeder Einheit r eine und nur eine Einheitswurzel $\Omega(r)$, welche wir einen 
Charakter von r nennen wollen, denn durch r ist ja das Indexsystem $(\rho, \rho_0, \ldots)$, also $\Omega(r)$ 
eindeutig bestimmt." 

When there is no fear of misunderstanding, we will, in the follow- 
ing, denote the underlying exponent system $(k, k_0, k_1, \ldots)$ by (k) for 
short, and we will denote the corresponding character simply by 
$\Omega^{(k)}(r)$. 
Here, again, to each system of values $(k, k_0, \ldots)$ and each unit r 
there clearly corresponds a character $\Omega^{(k)}(r)$.^56 [65, p. 445] 

^56 "Wenn kein Mißverständnis zu befürchten ist, wollen wir im folgenden das zu Grunde 
gelegte Exponentensystem $(k, k_0, k_1, \ldots)$ kurz durch (k) und den zugehörigen Charakter 
einfacher durch $\Omega^{(k)}(r)$ bezeichnen. Auch hier entspricht für ein festes Wertsystem 
$(k, k_0, \ldots)$ jeder Einheit r offenbar ein Charakter $\Omega^{(k)}(r)$." 

In contrast, say, to Hadamard, the index (k) is not arbitrary; rather, (k) is taken 
to range over the specific representing data. But the notation and organization 
of the proof enable us to ignore the details of the representation where they 
are not needed. Kronecker noted immediately that $\Omega^{(k)}(r)$ depends only on the 
value of r modulo m, since any two values with the same residue have the same 
index. He also noted that $\Omega^{(k)}(rr') = \Omega^{(k)}(r)\,\Omega^{(k)}(r')$. The choice of 
representation has the further nice property that $\Omega^{(k)}(r)\,\Omega^{(k')}(r) = \Omega^{(k+k')}(r)$, 
where (k+k') denotes the result of adding the elements of the tuples (k) and (k'), 
modulo the cardinality of the associated cyclic groups. Kronecker presented the 
first orthogonality principle: 

$$\sum_{(r)} \Omega^{(0)}(r) = \varphi(m),$$

where r ranges over a system of residues of the units modulo m and (0) is the 
index of the trivial character, and 

$$\sum_{(r)} \Omega^{(k)}(r) = 0$$

for the remaining characters. Kronecker's proof requires unfolding the notation 
and calculating, but thereafter the fact can be recalled and used in the above 
form. Similarly, he expressed the second orthogonality relation by the equations 

$$\sum_{(k)} \Omega^{(k)}(r) = \varphi(m)$$

when r is congruent to 1 modulo m, and 

$$\sum_{(k)} \Omega^{(k)}(r) = 0$$

otherwise. He also expressed the dependence of a Dirichlet series on a character 
in terms of a dependence on (k), by writing 

$$\sum_{n=1}^{\infty} \frac{\Omega^{(k)}(n)}{n^{s}} = \prod_{p} \frac{1}{1 - \dfrac{\Omega^{(k)}(p)}{p^{s}}}.$$

Kronecker's presentation provides us with an important lesson. The expo- 
sition is clearly designed to bring the central ideas to the fore and highlight 
relevant information, but succeeds in doing so while providing explicit represen- 
tations and algorithms throughout. This shows that although the moves towards 
abstraction that we have documented in this section can sometimes make it pos- 
sible to suppress or even ignore algorithmic information, they do not require one 
to do so. In other words, the conceptual reorganization opens the door to the 
use of other means of describing mathematical objects and operations on them, 
means that can supplant algorithmic aspects. The question as to whether such 
means are permissible, meaningful, and appropriate to mathematics lay at the 
core of twentieth century foundational debates. 

7.5 Landau 

Born in 1877, Edmund Landau received his doctorate at the University of Berlin 
in 1899, having studied number theory under Frobenius. He completed a Habili- 
tation thesis on Dirichlet series in 1901. Landau later presented proofs of Dirich- 
let's theorem in two textbooks, his 1909 Handbuch der Lehre von der Verteilung 
der Primzahlen [68] and his 1927 Vorlesungen über Zahlentheorie [69]. We shall 
here focus on his presentation in the 1909 work, since the later presentation is 
already essentially what we have portrayed as "contemporary" in Section 21 

Landau began by introducing the characters via the construction in terms 
of primitive roots and roots of unity. However, the notation that he used to 
denote them changed over the course of the book. Initially, he used a notation 
that is quite similar to Dirichlet's notation for L-functions, writing 

$$\chi_{a_1, a_2, \ldots, a_r, a, b}(n),$$

where the index system $a_1, a_2, \ldots, a_r, a, b$ serves to distinguish the characters 
just as Dirichlet's a, b, c, c', ... distinguished his L-functions. However, after 
proving that there are $\varphi(k)$ such characters, he simplified his notation to 

$$\chi_1(n), \chi_2(n), \ldots, \chi_{\varphi(k)}(n),$$

and to $\chi_x(n)$ in the general case, thus adopting notation similar to both de la 
Vallée-Poussin and Hadamard. 

But whereas Landau constructed the characters in the same way as Dirichlet, 
Dedekind, Hadamard and de la Vallée-Poussin, he also recognized that they are 
characterized by certain key properties, and that it is only these properties that 
are needed in the proof. Indeed, after introducing the abbreviations for the 
characters, Landau wrote: 

... I will prove four short and elegantly worded theorems about them 
[the characters]. The reader may then quickly forget the rather 
complicated definition of these functions completely, and need only 
remember that the existence of a system of h distinct functions which 
possesses the four properties has been proved.^57 [68, p. 404] 

The four theorems that Landau was referring to are the following: 

Theorem 1: For any two positive numbers n and n', 

$$\chi(nn') = \chi(n)\chi(n').$$

This "law of multiplication" holds for each of the h functions ... 

Theorem 2: For n ≡ n' (mod k), 

$$\chi(n) = \chi(n').$$

Theorem 3: When n runs through a complete residue system mod- 
ulo k, for x = 1, i.e. for the principal character, 

$$\sum_{n} \chi_x(n) = h,$$

however for x = 2, ..., h, i.e. for all other characters, 

$$\sum_{n} \chi_x(n) = 0.$$

Theorem 4: When n is fixed and the sum 

$$\sum_{x=1}^{h} \chi_x(n)$$

extends over all h functions, then 

$$\sum_{x=1}^{h} \chi_x(n) = h \quad \text{for } n \equiv 1 \pmod{k},$$

$$\sum_{x=1}^{h} \chi_x(n) = 0 \quad \text{for } n \not\equiv 1 \pmod{k},$$

therefore for all k - 1 other residue classes modulo k.^58 [68, pp. 401-408] 

^57 "... ich werde über sie vier Sätze mit sehr kurzem und elegantem Wortlaut beweisen. 
Alsdann darf der Leser bald die recht komplizierte Definition dieser Funktionen vollkommen 
vergessen und braucht sich nur zu merken, daß die Existenz eines Systems von h verschiedenen 
Funktionen bewiesen worden ist, welche die vier Eigenschaften besitzen." 

^58 "Satz 1: Es ist für zwei ganze positive Zahlen n, n' 

$$\chi(nn') = \chi(n)\chi(n').$$

These are exactly the four "important relations" given by de la Vallée-Poussin. 
What is novel here is Landau's explicit recognition that only these properties 
are used in the proof, and that the specific construction is only needed to show 
the existence of a system of functions that satisfy them. 
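
Landau's point, that the construction matters only as an existence proof, can be illustrated by obtaining the characters without any construction at all. The sketch below simply searches for every function on the units modulo k that satisfies the multiplication law, and then lets one check the sums in Theorems 3 and 4. It is a brute-force illustration of our own (the modulus 8 and all function names are assumptions made for the example), and it relies on the standard fact that every value of such a function is an h-th root of unity, where h is the number of units.

```python
import cmath
from itertools import product
from math import gcd


def characters_by_property(k):
    """All characters modulo k, found purely from the multiplication law.

    A candidate assigns to each unit r an exponent e[r], standing for the
    value exp(2*pi*i*e[r]/h); it is kept when e[ab] = e[a] + e[b] (mod h).
    Brute force over h**h candidates, so only sensible for small k.
    """
    units = [r for r in range(1, k + 1) if gcd(r, k) == 1]
    h = len(units)
    tables = []
    for exps in product(range(h), repeat=h):
        e = dict(zip(units, exps))
        if all(e[a * b % k] == (e[a] + e[b]) % h for a in units for b in units):
            tables.append(e)

    def as_function(e):
        # Extend by zero to integers that share a factor with k.
        return lambda n: cmath.exp(2j * cmath.pi * e[n % k] / h) if gcd(n, k) == 1 else 0

    return [as_function(e) for e in tables]


K = 8  # illustrative modulus; no special case for powers of two is needed here
CHARS = characters_by_property(K)
H = len(CHARS)


def theorem_3_sum(chi):
    """Sum of chi(n) as n runs through a complete residue system modulo K."""
    return sum(chi(n) for n in range(1, K + 1))


def theorem_4_sum(n):
    """Sum of chi(n) over all H characters, for fixed n."""
    return sum(chi(n) for chi in CHARS)
```

Nothing in the search refers to primitive roots or index systems; the functions are picked out by their characteristic property alone.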

Like Hadamard, Landau did not go so far as to allow the characters them- 
selves to index the sum in Theorem 4 above, relying on a natural num- 
ber proxy. But, like de la Vallée-Poussin, he gave an extensional classification 
of the three types of characters, though he also gave an additional description 
of the real characters in intensional terms. 

Just as Landau's notation for characters changed over the course of the book, 
so, too, did his notation for L-functions. Initially, he denoted them in a manner 
similar to Hadamard's, writing $L_x(s) = \sum_{n=1}^{\infty} \frac{\chi_x(n)}{n^s}$. However, later on, after 
concluding his presentation of the proof of Dirichlet's theorem, Landau adopted 
a "more convenient" notation: 

Now let 

$$L(s) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}$$

be the function corresponding to the character $\chi(n) = \chi_x(n)$; it is 
now more convenient to include the character in the notation, 

$$L(s, \chi),$$

and, only when there is no fear of misunderstanding, write 

$$L(s)$$

for short, as before.^59 [68, p. 482] 

Von jeder der h Funktionen wird also dies "Multiplikationsgesetz" behauptet ... 
Satz 2: Es ist für n ≡ n' (mod k) 

$$\chi(n) = \chi(n').$$

Satz 3: Wenn n ein vollständiges Restsystem modulo k durchläuft, ist für x = 1, d.h. für 
den Hauptcharakter 

$$\sum_{n} \chi_x(n) = h,$$

dagegen für x = 2, ..., h, d.h. für alle übrigen Charaktere 

$$\sum_{n} \chi_x(n) = 0.$$

Satz 4: Wenn n festgehalten und die Summe 

$$\sum_{x=1}^{h} \chi_x(n)$$

über alle h Funktionen erstreckt wird, so ist 

$$\sum_{x=1}^{h} \chi_x(n) = h \quad \text{für } n \equiv 1 \pmod{k},$$

dagegen 

$$\sum_{x=1}^{h} \chi_x(n) = 0 \quad \text{für } n \not\equiv 1 \pmod{k},$$

also für alle k - 1 übrigen Restklassen modulo k." 

Landau did not explain why the modern notation is more convenient, but we 
will offer some suggestions in the next section. 

Landau's 1909 presentation is interesting because it has a very transitional 
feel. Although he gave an intensional construction of the characters, he not only 
provided a thorough enumeration of their properties, but went out of his way to 
emphasize that these properties are all that matters to the proof. This is borne 
out by the fact that the usual division of characters into three classes can be 
carried out extensionally. And although he initially introduced natural number 
indices for the characters and L-functions and summed over these indices, he 
eventually adopted the modern functional notation for L-functions. 

When he wrote his 1927 textbook [69], Landau finally made the transition 
to a proof that is extremely close to the contemporary version. Indeed, he 
no longer defined the characters by describing how they are constructed, but, 
rather, defined them in terms of their characteristic properties. Moreover, he 
summed over the characters themselves, using expressions such as $\sum_{\chi} \chi(a)$. The 
only real difference between his proof and the one we presented in Section 4.4 is 
that he did not develop the general notion of a group-theoretic character, but, 
rather, defined them in terms of the particular group of units modulo m. This 
makes sense, given that the work is an elementary number theory textbook 
and characters are not needed for any other purpose. Roughly speaking, if we 
combine Landau's 1927 presentation with Weber's 1882 more general treatment 
of characters, the result is the presentation in Section 4. 



8 Mathematical concerns 

In Section 5.2, we considered a number of ways in which characters are treated 
as bona fide objects in contemporary proofs of Dirichlet's theorem. We noted 
that none of these features are present in Dirichlet's original proof, for the sim- 
ple reason that Dirichlet did not isolate or identify the notion of a character. 
Dedekind's 1863 presentation of Dirichlet's theorem did not use the term "char- 
acter," but he did introduce the notation, $\chi(n)$, for characters, and identified 
the defining property of a homomorphism as their "characteristic property." As 
noted earlier, after Weber's paper of 1882 the general notion of a group 
character was in place, and all the subsequently published proofs of Dirichlet's 
theorem use the terminology of characters. 

^59 "Es sei nun 

$$L(s) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}$$

die dem Charakter $\chi(n) = \chi_x(n)$ entsprechende Funktion; es ist jetzt bequemer, um den 
Charakter in die Bezeichnung aufzunehmen, 

$$L(s, \chi)$$

zu schreiben, und nur, wenn kein Mißverständnis zu befürchten ist, wie früher kurz 

$$L(s)."$$

But now the set of characters modulo m can be defined extensionally, as the 
set of nonzero homomorphisms from $(\mathbb{Z}/m\mathbb{Z})^*$ to the complex numbers, or inten- 
sionally, as functions defined by certain algebraic expressions involving certain 
primitive elements modulo the prime powers occurring in the factorization of m, 
and certain complex roots of unity. Even though the two definitions give rise to 
the same set of characters, proofs can differ in the extent to which they rely on 
the specific representations or the abstract characterizing property. Dirichlet's 
proof relied only on the symbolic representations. Somewhat surprisingly, both 
Dedekind's and Hadamard's division of the characters into the trivial, real, and 
complex cases was also described in terms of the characters' representations, 
even though the distinction is naturally expressed in terms of the values they 
take. Kronecker and de la Vallée-Poussin provided both descriptions, and even 
though Kronecker made it clear that all operations and classifications can be 
carried out, algorithmically, in terms of the canonical representations, his careful 
choice of notation and organization made the extensional properties salient. By 
1927, Landau clearly favored the extensional characterization in his textbook. 

We have also observed that modern notation like $\sum_{\chi} \chi(n)$ allows us to carry 
out summations over the finite set of characters modulo m, but that this nota- 
tion was not available to Dirichlet's early expositors. Dirichlet, Dedekind, and 
Kronecker all took summations to range over the tuples of integers representing 
the characters via an explicit algebraic definition, though Kronecker's way of 
letting a variable (k) range over these tuples is more attractive than Dirichlet's 
use of a, b, c, c', .... In 1882, Weber wrote his sums with ellipses. Curiously, 
Hadamard in 1896, and Landau in 1909, assigned arbitrary integer indices to 
the characters, and took sums to range over those indices. The only nineteenth- 
century author to take summation over characters at face value was de la Vallée- 
Poussin, who nonetheless introduced separate notation to denote such sums. 
By 1927, however, Landau had adopted the modern notation. 

Finally, we have emphasized the modern tendency to represent the depen- 
dence of an L-series on a character $\chi$ as a functional dependence, with the 
notation $L(s, \chi)$. Once again, de la Vallée-Poussin was the only nineteenth-cen- 
tury expositor of Dirichlet's theorem to do so, with the notation $Z(s, \chi)$. We 
saw that Landau made the transition in the middle of his 1909 book, and that 
in his 1927 textbook he relied exclusively on the notation $L(s, \chi)$. 

At issue in all these developments is whether characters could be treated in 
much the same way as natural numbers and real numbers, or whether charac- 
ters are different sorts of objects, whose treatment has to be mediated by more 
"concrete" mathematical representations. We contemporary readers of the nine- 
teenth century literature are now so familiar with a modern perspective that it 
can be hard for us to appreciate the reasons it took so long for the community 
to adopt it. Let us begin our analysis of the history, then, by reflecting on the 
countervailing pressures. 

We have had a lot to say about the notational and expressive conventions 
at play. The difficulty of settling on stable and useful conventions should not 
be minimized. For example, any mathematician writing a proof has to choose 
names for variables: should one use m, n, and k to range over natural numbers, 
or x, y, and z? One desideratum is to maintain continuity with the background 
literature, but other constraints come into play: for example, one reason to favor 
m, n, and k may be that x, y, and z are natural choices to range over other ob- 
jects, like real or complex numbers, arising in the proof. It is significant, though 
not surprising, that Dedekind, 26 years after Dirichlet's proof was published, 
and de la Vallée-Poussin, fully 60 years later, both stuck with Dirichlet's choice 
of m and k for the first term and common difference in the arithmetic progres- 
sion, respectively. Even today, q is often used to range over prime numbers in 
the definition of the L-series, and we still use the letter L exclusively, though 
most contemporary proofs use a and d in place of m and k. 

In a similar way, one has to settle on choices of notation. This also requires 
thought, even in places where the norms that govern the notation are fairly 
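clear. For example, assuming one divides the set C of characters modulo k into 
the set consisting of the trivial character, $C_{\mathrm{triv}}$, the set of real characters, $C_{\mathrm{real}}$, 
and the set of complex characters, $C_{\mathrm{complex}}$, it is clear that a sum over C can 
be broken up accordingly: 

$$\sum_{\chi \in C} \ldots = \sum_{\chi \in C_{\mathrm{triv}}} \ldots + \sum_{\chi \in C_{\mathrm{real}}} \ldots + \sum_{\chi \in C_{\mathrm{complex}}} \ldots$$
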
But, even so, one has to settle on the notation to express this relationship, and, 
as the history of Dirichlet's theorem shows, it can be a long time before a partic- 
ular means of expression becomes standard. The modern notation emphasizes 
the parallels between summing over sets C of characters and summing over sets 
C of natural numbers, but it is by no means obvious that conflating the two is 
a good idea. 
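
To see what the decomposition of a sum over C amounts to in a concrete case, here is a small sketch for the characters modulo 5. The modulus, the use of 2 as a primitive root, and the helper names are illustrative assumptions of ours. The classification is carried out extensionally, by inspecting the values each character takes, and the sum over all characters is then compared with the sum of the three partial sums.

```python
import cmath

P = 5  # illustrative prime modulus; 2 is a primitive root modulo 5
INDEX = {pow(2, e, P): e for e in range(P - 1)}
UNITS = sorted(INDEX)


def make_chi(t):
    """Character modulo 5 determined by chi(2) = exp(2*pi*i*t/4)."""
    return lambda n: cmath.exp(2j * cmath.pi * t * INDEX[n % P] / (P - 1))


C = [make_chi(t) for t in range(P - 1)]


def classify(chi, tol=1e-9):
    """Sort a character into 'triv', 'real', or 'complex' by its values alone."""
    values = [chi(n) for n in UNITS]
    if all(abs(v - 1) < tol for v in values):
        return "triv"
    if all(abs(v.imag) < tol for v in values):
        return "real"
    return "complex"


def total(chars, n):
    """Sum of chi(n) over the given collection of characters."""
    return sum(chi(n) for chi in chars)


classes = {"triv": [], "real": [], "complex": []}
for chi in C:
    classes[classify(chi)].append(chi)
```

Here `total(C, n)` agrees with the sum of `total(classes[name], n)` over the three class names, which is all the displayed identity asserts; the code treats a list of functions exactly as it would treat a list of numbers.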

Contemporary readers of Frege's Begriffsschrift [43] and Grundgesetze [47], 
which we will discuss in Sections 9 and 10, may be struck by how many pages 
are devoted to explaining the syntax of the formal language. For those familiar 
with modern logic, Frege's lengthy explanations seem fiddly and pedantic. But, 
for Frege, getting the grammatical rules worked out was much of the battle. The 
syntax of a language only seems trivial when you already know how to speak it. 

But it would be a mistake to suggest that all the considerations that Dirich- 
let's successors faced were "merely" notational. Means of expression often rely 
on substantial features of our understanding of the nature of that which is ex- 
pressed. For example, consider the modern way of writing Dirichlet's famous 
example of 1829: 

$$f(x) = \begin{cases} a & \text{if } x \text{ is rational} \\ b & \text{otherwise.} \end{cases}$$

This bears superficial similarity to case-based definitions of number-theoretic 
functions, or piecewise definitions of analytic functions, that were familiar at 
the time. But it was novel to base a case distinction on the property of being 
rational, and, indeed, the notation masks significant assumptions about what it 
means to define a function on the reals. Consider Frege's brief account of the 
evolution of the function concept in 1891: 

In the first place, the field of mathematical operations that serve for 
constructing functions has been extended. Besides addition, mul- 
tiplication, exponentiation, and their converses, the various means 
of transition to the limit have been introduced — to be sure, people 
have not always been clearly aware that they were thus adopting 
something essentially new. People have gone further still, and have 
actually been obliged to resort to ordinary language, because the 
symbolic language of Analysis failed; e.g. when they were speak- 
ing of a function whose value is 1 for rational and 0 for irrational 
arguments.^60 [HI p. 12] 

Frege was not merely concerned to have a convenient notation to express the 
definition by cases. The definition of f above does that perfectly well, even 
today. The point is rather that the notation should come with clear rules of use. 
That is what Frege took to be lacking in the casual use of ordinary language, and 
what he took his formal system to provide. Similar methodological concerns lie 
beneath the surface whenever we write $\lim_n a_n$ to denote the limit of a sequence 
of real numbers, or I + J to denote the sum of two ideals in a ring of algebraic 
integers. There is nothing tame about the infinitary operations that underlie 
the notation. 

In contrast, foundational questions regarding the use of characters may seem 
mild. After all, it is easy to represent the characters modulo a positive integer 
k, and any character is determined by the values it takes on the finitely many 
residues modulo k. Nonetheless, the use and treatment of characters in proofs 
of Dirichlet's theorem bears upon central questions regarding the use and treat- 
ment of functions more generally, specifically regarding the relationship between 
a function and its various representations. 

^60 "Erstens nämlich ist der Kreis der Rechnungsarten erweitert worden, die zur Bildung einer 
Funktion beitragen. Zu der Addition, Multiplikation, Potenzierung und deren Umkehrungen 
sind die verschiedenen Arten des Grenzüberganges hinzugekommen, ohne daß man allerdings 
immer ein klares Bewußtsein von dem wesentlich Neuen hatte, das damit aufgenommen werde. 
Man ist weiter gegangen und sogar genötigt worden, zu der Wortsprache seine Zuflucht zu 
nehmen, da die Zeichensprache der Analysis versagte, wenn z.B. von einer Funktion die Rede 
war, deren Wert für rationale Argumente 1, für irrationale 0 ist." 

Let us think of Dirichlet L-functions $L(s, \chi)$ in quasi-computational terms. 
Such a function should take, as input, a real (or complex) value s and a character 
$\chi$, and return a complex number. Let us set aside the (important) question as 
to what it means to take a real or complex number as input, or return one as 
output. What does it mean to accept a character as an input? Should one think 
of the character as being "presented" to the function as an infinite set of input- 
output pairs? Or as the finite list of values on the residues modulo k? (In that 
case, does the same conception work for function arguments that are not finitely 
determined?) Should one think, rather, of $\chi$ as being some sort of procedure, 
or subroutine? If so, what sorts of procedures and subroutines are allowed in 
the definition of a functional with argument $\chi$? Perhaps, instead, we should 
identify $\chi$, as Dirichlet did, with its representation in terms of an expression 
involving certain roots of unity. But recall that those representations relied 
on choices of primitive elements in the representation of k, although it turns 
out that the value of the L-function does not vary with different choices of 
the primitive elements. On our conception, does L somehow "depend" on the 
choices of primitive elements? 

These issues arise not only with respect to functional notation, but also 
with respect to statements involving quantifiers over characters. For example, 
the proof of the second orthogonality lemma requires the following fact: 

If $n$ is relatively prime to $k$ and $n \not\equiv 1 \pmod{k}$, there is a character $\chi$ such that $\chi(n) \neq 1$.

If we prove this as a separate lemma, we can then invoke the lemma for a
given $k$ to "obtain" a $\chi$ with the relevant property. But what exactly have
we "obtained"? A table of values? A procedure? A representation? If the
lemma constructs a $\chi$ via a representation of one sort, but our proof of the
second orthogonality lemma relies on a different representation of characters, is 
it legitimate to apply the lemma in that context? 

The modern theory of computability and the semantics of programming
languages offer various ways of thinking about computer programs which take
functions as arguments. The issues are subtle and complex. In contrast, modern
mathematics followed a route whereby these subtleties are for the most part
set aside. In particular, they are deemed incidental to the proof of Dirichlet's
theorem. Roughly speaking, to make sense of a functional $F(f)$ with function
argument $f$, set theory identifies $f$ "canonically" with its extension, a set of
input-output pairs, without concern as to how $f$ is represented or how (and
whether) one can "compute" $F(f)$. Set theory then imposes on mathematical
language the restriction that the definition of such a functional $F$ can only
depend on the extension of $f$. (Or, put differently: modern mathematical
conventions evolved to ensure the latter fact, and axiomatic set theory was designed
to model and explain those conventions.)
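
The contrast between a function and its representations can be made concrete
with a small sketch in Python. This is an illustration of our own, not
something found in the historical sources. It presents one nonprincipal
character modulo 5 in two ways: as a table of values on the residues, and as a
procedure that computes the index of $n$ with respect to the primitive root 2.
The modulus, the choice of primitive root, all of the names, and the truncation
of the L-series to a partial sum are assumptions made only for the example.

```python
from math import gcd

K = 5  # modulus, chosen only for illustration

# Fourth roots of unity, listed exactly to avoid floating-point noise.
ROOTS_OF_UNITY = [1, 1j, -1, -1j]


def chi_from_table(n):
    """A character mod 5, presented as a table of values on residues."""
    table = {0: 0, 1: 1, 2: 1j, 3: -1j, 4: -1}
    return table[n % K]


def chi_from_primitive_root(n):
    """The same character, presented as a procedure: find the index of n
    with respect to the primitive root 2, then raise i to that index."""
    n %= K
    if gcd(n, K) != 1:
        return 0
    index, power = 0, 1
    while power != n:  # terminates because 2 generates the units mod 5
        power = (power * 2) % K
        index += 1
    return ROOTS_OF_UNITY[index % 4]


def l_partial_sum(s, chi, terms=1000):
    """Partial sum of the series sum_n chi(n) / n^s. The functional only
    ever *calls* chi, so it sees nothing but chi's input-output behavior."""
    return sum(chi(n) / n ** s for n in range(1, terms + 1))


if __name__ == "__main__":
    print(l_partial_sum(2, chi_from_table))
    print(l_partial_sum(2, chi_from_primitive_root))
```

Because `l_partial_sum` uses its argument only by applying it, the two
presentations give the same result. This is the extensional discipline
described above, enforced here by the way the functional is written rather
than by any set-theoretic convention.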

In the nineteenth century, the answers to the questions raised above were not 
at all obvious. Indeed, they are still debated among logicians and foundationally- 
minded mathematicians today. Even for those inclined to dismiss those ques- 
tions as irrelevant to the proof of Dirichlet's theorem, it would not have been 
immediately clear as to whether they really could be dismissed, and, if so, how 
that should be done. What may seem to be "merely notational" developments 
in the presentation of Dirichlet's theorem were part and parcel of the broader 
mathematical community's attempt to fashion an understanding of functions as 
objects that would better support the mathematics of the time. 

Setting aside issues of meaning, there may also be concerns about correctness.
Any choice of notation that draws on an analogy between different domains
presupposes that the analogy is appropriate, which is to say, that sufficiently
many properties carry over, and that there are sufficient safeguards to bar the
ones that do not. For example, a sum of the form $\sum_{x \in S} t(x)$ generally makes
sense when $S$ is finite and the term $t(x)$ takes values in a domain with an
addition that is associative and commutative, so that the order in which one "runs
through" the elements of $S$ does not matter. Summing over infinite totalities
is more subtle; it typically depends on having a notion of convergence for the
domain in which $t(x)$ takes values, and worrying about the order in which terms
are summed. Now, choosing the notation $\sum_{x \in S} t(x)$ may not be a good idea if
it encourages users to transfer facts and properties between domains in an invalid
way. This would explain why de la Vallée-Poussin chose a special notation
$S_\chi$ for such sums, as such a new symbol would not come with any unwanted
baggage. Adopting the contemporary notation may have required some kind of
assurance that the intended contexts would be sufficient to control for proper
use.
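
The difference between the finite and infinite cases can be checked
numerically. The following Python sketch is our own illustration, with
hypothetical function names. It first confirms that a finite sum of rationals
is the same under every ordering of its terms. It then compares the alternating
harmonic series, which converges to $\ln 2$, with the classical rearrangement
that takes two positive terms for each negative one, which converges to
$\tfrac{3}{2} \ln 2$. The same terms, summed in a different order, give a
different value.

```python
import math
from fractions import Fraction
from itertools import permutations


def finite_sum_is_order_independent(values):
    """Sum the values in every possible order and check that all orders agree.
    Exact rational arithmetic is used so that rounding plays no role."""
    totals = {sum(ordering, Fraction(0)) for ordering in permutations(values)}
    return len(totals) == 1


def alternating_harmonic(terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in its usual order."""
    return sum((-1) ** (n + 1) / n for n in range(1, terms + 1))


def rearranged_alternating_harmonic(blocks):
    """Partial sum of the same terms, reordered as
    1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ... (two positive, then one negative)."""
    total = 0.0
    for b in range(1, blocks + 1):
        total += 1 / (4 * b - 3) + 1 / (4 * b - 1) - 1 / (2 * b)
    return total


if __name__ == "__main__":
    sample = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5), Fraction(-1, 7)]
    print(finite_sum_is_order_independent(sample))
    print(alternating_harmonic(10000), math.log(2))
    print(rearranged_alternating_harmonic(10000), 1.5 * math.log(2))
```

A notation that is safe for finite index sets therefore needs additional
safeguards, such as absolute convergence, before its familiar properties can
be carried over to infinite ones.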

In sum, mathematical conventions evolve, expand, and change. Any time 
a mathematician writes a line of text, he or she is situated in a tradition with 
implicit norms and conventions, and the line of text just written becomes part 
of that tradition. The fact that so many of these conventions are communicated 
implicitly does not make them any less important to mathematical understand- 
ing. 

When a mathematician writes a proof, the intent is that the inferences con- 
tained therein will be deemed by his or her colleagues to be correct and justified. 
Where the inferences are instances of familiar patterns, one desires that they 
will be easily recognized as such, so that the reader's effort is conserved for more 
substantial cognitive tasks. When the inferences rely on assumptions that may 
be considered dubious, or push familiar patterns of reasoning into unexplored 
territory, there is greater concern not only as to whether the reasoning will 
be recognized as correct, but also as to whether it will be deemed appropriate 
to addressing the mathematical issues at hand. Thus there are always strong 
pragmatic pressures to stick close to established convention, and one should 
not expect fundamental aspects of the language and method of mathematics to
change in novel ways, unless there are strong forces pushing for such change. 

In the case of Dirichlet's theorem, we have seen some of the ways in which 
Dirichlet's successors chose to modify, or "improve," his presentation. Now let 
us try to understand some of the perceived benefits. Kronecker's proof was 
explicitly designed to fill in information that was absent from Dirichlet's proof, 
in the form of explicit bounds on a quantity asserted to exist. Many of the 
other developments were explicitly designed to pave the way to useful
generalizations. This was clearly Dedekind's intent, for example, in pointing 
out that the Euler product formula depended only on certain multiplicative 
properties of the terms occurring in the sum. Similarly, abstracting proofs of 
properties of characters on (Z/mZ)* that rely on features specific to the integers 
modulo m paves the way to extending these properties to group characters more 
generally. Characters and their properties form the basis for representation 
theory, which has been an essential part of group theory since the turn of the 
twentieth century [56][74]. Authors like Hadamard, de la Vallée-Poussin, and
Landau were also interested in extending Dirichlet's methods to other kinds of
Dirichlet series, which now play a core role in analytic number theory. 

But it would be a mistake to attribute all the benefits of the expository inno- 
vations we have considered to increased generality. After all, these innovations 
play an equally important role in fostering a better understanding of Dirichlet's 
proof itself, by highlighting key features of the concepts and objects in question, 
motivating the steps of the proof, and reducing cognitive burden on the reader 
by minimizing the amount of information that needs to be kept in mind at each 
step along the way. It is true that these benefits often support generalization, 
but they do so in part by making our thinking vis-a-vis Dirichlet's proof itself 
more efficient. 

For example, proofs of Dirichlet's theorem that rely on explicit representa- 
tions of the characters require us to keep the details of the representation in 
mind. Recall Dirichlet's original representation of an arbitrary character: 

$$\chi(n) = \theta^{\alpha} \varphi^{\beta} \omega^{\gamma} {\omega'}^{\gamma'} \cdots$$

Here the reader has to keep in mind that $\alpha$, $\beta$, $\gamma$, $\gamma'$, and so on are the indices of
$n$ with respect to primitive elements chosen in the decomposition of the abelian
group (Z/mZ)*, and $\theta$, $\varphi$, $\omega$, $\omega'$, and so on are corresponding roots of unity.
This information has to be kept in mind throughout the proof, because the
nature of the objects and the dependences of $\alpha, \beta, \gamma, \gamma'$ on $n$ may play a role
in licensing an inference or calculation. Moreover, recall that we obtain all the 
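characters by expressing all the roots

$$\theta = \Theta^{a}, \quad \varphi = \Phi^{b}, \quad \omega = \Omega^{c}, \quad \omega' = {\Omega'}^{c'}, \quad \ldots$$

in terms of primitive roots of unity $\Theta$, $\Phi$, $\Omega$, $\Omega'$, ..., and letting $a$, $b$, ... range
over the appropriate exponents. Once again, this information has to be
remembered throughout. Later proofs are easier to read simply because they do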
not require us to keep as much information in mind, and highlight the relevant 
dependences when they are needed. 

One way of achieving this is by reorganizing the proof in such a way that 
some of the relevant information is localized to particular facts and calculations. 
For example, even if one resorts to representations to prove the orthogonality 
relations, if this is the only place they are used, then they do not need to 
be ready to hand when these relations are invoked in a calculation later on, 
where other analytic expressions and their properties are the objects of focus. 
Thus modularity reduces cognitive burden, and makes it easier to keep track of 
the global structure of the argument, providing high-level outlines, or sketches, 
of the proof. Such restructuring paves the way to generality: isolating
context-specific details in well-insulated modules means that one can adapt the proof by
changing the modules while preserving their external interfaces. But, to repeat, 
generality is not the only benefit: the restructuring improves the readability of 
the original proof as well. 

Even in situations where there is a lot of information in play at once,
judicious notation and means of expression can make important relationships
between the data more salient. For example, in Section 6 we saw that Dirichlet
wrote




where we would write 

The second presentation highlights the relationship between the series $L(s, \chi)$
and the characters; in particular, the value $L(s, \chi)$ is multiplied by the value of
the conjugate character, $\overline{\chi}$, at $m$ in each term of the sum on the left-hand side.
Dirichlet's notation obscures this relationship, since the logarithm of the appropriate
L-series $L_{a,b,c,c',\ldots}$ is multiplied by the character
$\Theta^{-a\alpha_m} \Phi^{-b\beta_m} \Omega^{-c\gamma_m} {\Omega'}^{-c'\gamma'_m} \cdots$,
and the reader must sift through this expression to notice that the exponents
$-a, -b, -c, -c', \ldots$ correspond to the subscripts of the L-series.

The same problem applies to Hadamard's method of using arbitrary indices
for the characters: given a character $\psi_\nu$, there is no natural name for the index
of its conjugate $\overline{\psi_\nu}$. Of course, one can introduce such a notation, but that
requires keeping track of the relationship between the two notations, that is,
the conjugation of characters and the associated operation on indices. And
grouping the characters into different classes requires grouping the indices into
different classes, again yielding an uncomfortable duality. (At least Kronecker
maintained a monist consistency, insisting that all operations on characters are
operations on their representations. So when $\Omega^{(k)}$ is a character, Kronecker
could write $\Omega^{(-k)}$ for its conjugate, since the conjugate character is obtained by
negating the elements of the corresponding tuple.)

Similar considerations may explain why Landau changed his notation for
L-functions in the middle of his 1909 work, from $L_x(s)$ to $L(s, \chi)$. In the paragraph
following his notation change, Landau showed that the theory of L-functions
can be reduced to L-functions that correspond to particular types of characters,
called proper characters. Roughly, proper characters modulo $k$ are those which
cannot be obtained as a character modulo $K$ where $K < k$. To show that the
theory of L-series can be reduced in the appropriate way, he proved that for an
improper character modulo $k$ and $\sigma > 1$, we have [68, 482-483]:

Here $L_0(s, X)$ is an L-series corresponding to the proper character $X$ modulo $K$,
the $\varepsilon_\nu$ are certain roots of unity, and $c$ is a number (which can be 0, depending on how
many prime factors of $k$ are contained in $K$). If Landau had kept his original
notation, the left hand side of the above equation would be written as $L_x(s)$
where $\chi = \chi_x$. But how would we represent the right hand side? We should
note that one obvious choice for an index, $x'$, would not be available, since
Landau used this previously in connection with the conjugates of characters.
In the text, Landau had identified a relationship between a character, $\chi$, and
the corresponding proper character, $X$; having to translate this to a relationship
between indices would only clutter the exposition. These issues are compounded
when, in later sections, Landau wanted to obtain functional equations to relate
$L(s, \chi)$ and $L(1 - s, \overline{\chi})$ and in doing so referred to the distinction between proper
and improper characters (see e.g. [68, §130]). Using the old notation, we would
need to keep track of three different subscripts to index the various characters.

Consider, finally, the issue of uniformity of notation. We have already dis- 
cussed concerns associated with the notation for summation over characters. 
But there is an obvious benefit to using the same notation for summing over 
finite sets of characters and summing over finite sets of integers, namely, that 
the two operations really do share common properties. Indeed, the transfer is 
immediate via the Hadamard trick of assigning an integer index to each charac- 
ter. The option of introducing a new symbol every time one needs to index sums 
by finite sets of objects is clearly untenable, as it would result in a confusing 
explosion of notations. It does not seem at all surprising that de la Vallée-Poussin's
notation $S_\chi$ was short-lived.

To sum up, then, we have identified a number of advantages to the rewritings 
of Dirichlet's proof considered in the last section: 

• Essential properties of key objects (or expressions) were isolated, reducing 
the amount of information that someone reading the proof has to keep in 
mind at each step. 

• The organization of proofs became more modular, with information local- 
ized to very specific parts of the proof. 

• Expressions became more readable, since irrelevant details were suppressed, 
and the features that remained made dependences and relationships be- 
tween terms more salient. 

• Notation became more uniform, highlighting commonalities between dif- 
ferent domains. 

Changes like this often go hand in hand with attempts to generalize concepts 
and methods to other domains, since managing and controlling the volume 
of domain-specific detail tends to bring to the fore aspects of the proof that 
transcend these specifics. But they also contribute to a better understanding of 
Dirichlet's proof itself, and make the proof easier to read and reproduce from 
memory. 

Benefits such as these are often dismissed as merely "pragmatic" or "cog- 
nitive," but this downplays the fact that such considerations effectively shape 
and justify the norms that guide our mathematical practice. The philosophy of 
mathematics needs to take them seriously.

9 Function and object in Frege



In 1940, Alonzo Church presented a formulation of type theory [17], now known
as "simple type theory." Simple type theory can serve as a foundation for a
significant portion of mathematics, and, indeed, is the axiomatic foundation of
choice for a number of computational interactive theorem provers today [51][54]
[52]. One starts with some basic types, say, a type $B$ of Boolean truth values and
a type $N$ of natural numbers, and one forms more complex types $\sigma \times \tau$ and $\sigma \to \tau$
from any two types $\sigma$ and $\tau$. Intuitively, objects of type $\sigma \times \tau$ are ordered pairs,
consisting of an object of type $\sigma$ and an object of type $\tau$, and objects of type
$\sigma \to \tau$ are functions from $\sigma$ to $\tau$. In a type-theoretic approach to the foundations
of mathematics, one identifies sets of natural numbers with predicates, which is
to say, objects of type $N \to B$. Binary relations on the natural numbers are then
objects of type $N \times N \to B$, and sequences of natural numbers are objects of type
$N \to N$. Objects at this level are called type 1 objects, because they require
one essential use of the function space arrow. Integers can be identified as
pairs of natural numbers and rationals can be identified as pairs of integers in the usual
ways. Real numbers are then Cauchy sequences of rationals (type 1 objects), or
equivalence classes of such, which puts them at type 2. Functions from the reals 
to the reals and sets of reals are then objects of type 3, and sets of functions 
from the reals to reals or collections of sets of real numbers are then objects of 
type 4. For example, the collection of Borel sets of real numbers is a type 4 
object, as is Lebesgue measure, which maps certain sets of real numbers to the 
real numbers. A set of measures on the Borel sets of the real numbers is a type 
5 object. And so on up the hierarchy. 
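
The bookkeeping in this hierarchy can be made explicit with a short sketch in
Python. This is our own illustration, not Church's formulation. It represents
simple types as data and computes a level by counting nested uses of the
function space arrow on the left-hand side. The class names and the precise
definition of `level` are assumptions, chosen so that the computed levels agree
with the ones listed in the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:
    """A basic type such as N (natural numbers) or B (truth values)."""
    name: str


@dataclass(frozen=True)
class Prod:
    """The product type of ordered pairs."""
    left: object
    right: object


@dataclass(frozen=True)
class Arrow:
    """The type of functions from source to target."""
    source: object
    target: object


def level(t):
    """Level of a type: 0 for basic types, unchanged by pairing, and raised
    by one for each essential use of the arrow (assumed definition)."""
    if isinstance(t, Base):
        return 0
    if isinstance(t, Prod):
        return max(level(t.left), level(t.right))
    if isinstance(t, Arrow):
        return max(level(t.source) + 1, level(t.target))
    raise TypeError(f"not a simple type: {t!r}")


N, B = Base("N"), Base("B")
Z = Prod(N, N)                      # integers as pairs of naturals
Q = Prod(Z, Z)                      # rationals as pairs of integers
CAUCHY = Arrow(N, Q)                # sequences of rationals
R = Arrow(CAUCHY, B)                # reals as classes of Cauchy sequences
SET_OF_REALS = Arrow(R, B)
LEBESGUE = Arrow(SET_OF_REALS, R)   # maps sets of reals to reals
SET_OF_MEASURES = Arrow(LEBESGUE, B)

if __name__ == "__main__":
    for name, t in [("N -> B", Arrow(N, B)), ("R", R), ("R -> R", Arrow(R, R)),
                    ("Lebesgue measure", LEBESGUE),
                    ("set of measures", SET_OF_MEASURES)]:
        print(name, level(t))
```

On this definition, pairing never raises the level, which is why integers and
rationals stay at the bottom while each passage to sets or functions climbs one
step.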

Simple type theory can be viewed as a descendant of the ramified type
theory of Russell and Whitehead's Principia Mathematica [92], which, in turn, was
inspired by the formal system of Frege's Grundgesetze der Arithmetik [47].
Starting with a basic type of individuals, Frege's system also has variables ranging
over higher-type functionals, and so can be seen as an incipient form of modern 
type theory. For that reason, it can then come as a surprise to logicians familiar 
with the modern type-theoretic understanding that the foundational outlook 
just described is not at all the image of mathematics that Frege had in mind. 
It is this image that we wish to explore here. 

Frege took concepts to be instances of functions; for example, in "Function
and concept" he wrote that "a concept is a function whose value is always a
truth value" [45, p. 139]. And, throughout his career, he was insistent that
functions are not objects. The third "fundamental principle" in his Grundlagen
der Arithmetik of 1884 was "never to lose sight of the distinction between
concept and object" [44, Introduction], and he later asserted that "it will not do
to call a general concept word the name of a thing" [44, §51]. The distinction
features prominently in his essays "Function and Concept," "Comments
on Sinn and Bedeutung" and "Concept and object" of 1891, 1891/2, and 1892,
respectively.

We should note that in this section we will focus on his views from 1884 onwards. Prior
to this, he seems to have held a different view of concepts, though he still maintained that
they are not objects; see [58, p. 136].

"... der Unterschied zwischen Begriff und Gegenstand ist im Auge zu behalten."
"... ist es unpassend, ein allgemeines Begriffswort Namen eines Dinges zu nennen."

Frege's distinction was primarily linguistic: objects are denoted by words 
and phrases that can fill the subject role in a grammatical sentence, whereas 
concepts are denoted by words and phrases that can play the role of a predicate. 
In "Concept and object" he wrote: 

We may say in brief, taking "subject" and "predicate" in the lin- 
guistic sense: a concept is the Bedeutung of a predicate; an object 
is something that can never be the whole Bedeutung of a predicate, 
but can be the Bedeutung of a subject. [46, p. 198]

And: 

A concept — as I understand the word — is predicative. On the other 
hand, a name of an object, a proper name, is quite incapable of 
being used as a grammatical predicate. [46, p. 193]

In the sentence, "Frege is a philosopher," the word "Frege" denotes an object, 
and the phrase "is a philosopher" denotes a concept. Frege clarified the distinc- 
tion by explaining that functional expressions, including concept expressions, 
are "unsaturated," or incomplete. These stand in contrast to signs that are 
used to denote objects, which are complete in and of themselves. For example, 
in the sentence "Frege is a philosopher," the expression "Frege" is saturated, 
and succeeds in picking out an object. In contrast, the expression ". . . is a 
philosopher" contains a gap, and fails to name an object until one fills in the 
ellipsis, at which point the expression denotes a truth value.

Having distinguished between concepts and objects in such a way, Frege 
had to deal with objections, such as the one he attributed to Benno Kerry in 
"Concept and object." In the sentence "The concept 'horse' is a concept easily 
attained" the concept denoted by 'horse' does fill the subject role. Frege's 
surprising answer was to deny that the phrase "the concept 'horse' " denotes a 
concept. He conceded that this sounds strange: 

It must indeed be recognized that we are confronted by an awkwardness
of language ... if we say that the concept horse is not a
concept ... [46, pp. 196-197]

"Wir können kurz sagen, indem wir "Prädikat" und "Subjekt" im sprachlichen Sinne
verstehen: Begriff ist Bedeutung eines Prädikates, Gegenstand ist, was nie die ganze Bedeutung
eines Prädikates, wohl aber Bedeutung eines Subjekts sein kann." The word Bedeutung is often
translated as "reference" or "denotation." But for difficulties in the translation, see §4 of
the introduction to Beaney [8].

"Der Begriff – wie ich das Wort verstehe – ist prädikativ. Ein Gegenstandsname hingegen,
ein Eigenname ist durchaus unfähig, als grammatisches Prädikat gebraucht zu werden."

While the distinction between saturated and unsaturated expressions is cast as a
distinction between linguistic signs, in his 1904 essay "What is a Function?" Frege made it clear
that the dichotomy extends to functions and objects themselves: "The peculiarity of
functional signs, which we here called 'unsaturatedness', naturally has something answering to it in
the functions themselves. They too may be called 'unsaturated' ..." ("Der Eigentümlichkeit
der Funktionszeichen, die wir Ungesättigtheit genannt haben, entspricht natürlich etwas an den
Funktionen selbst. Auch diese können wir ungesättigt nennen ...") [48, p. 665].

Yet, he insisted, this is what we must do. He was already clear about this in 
the Grundlagen: 

The business of a general concept word is precisely to signify a con- 
cept. Only when conjoined with the definite article or a demonstra- 
tive pronoun can it be counted as the proper name of a thing, but 
in that case it ceases to count as a concept word. The name of a 
thing is a proper name. [44, §51]

And so, in "Concept and object," he reminded us: 

If we keep it in mind that in my way of speaking expressions like 
"the concept F" designate not concepts but objects, most of Kerry's 
objections already collapse. [46, pp. 198-199]

He similarly urged us to reconstrue expressions like "all mammals have red 
blood" as "whatever is a mammal has red blood" so as to avoid the impression 
that the predicate "has red blood" is being applied to an object, "mammal." 
Although these examples deal with concepts, Frege's analysis makes it clear 
that he intended the linguistic separation to remain decisive for other kinds of 
functions as well. 

At the same time, Frege was equally dogmatic in insisting that what we 
commonly take to be mathematical objects really are mathematical objects as 
such. The introduction to his Grundlagen begins as follows: 

When we ask someone what the number one is, or what the symbol 



a brazen rhetorical flourish to frame the whole project makes it clear just how 
central the issue is to his analysis. Once again, his reason for maintaining the 
distinction is largely linguistic; for example, because the number 7 plays the 
role of a subject in the statement "7 is odd," 7 must be an object. But, once 

"Es kann ja nicht verkannt werden, daß hier eine freilich unvermeidbare sprachliche Härte
vorliegt, wenn wir behaupten: der Begriff Pferd ist kein Begriff ...."

"Ein allgemeines Begriffswort bezeichnet eben einen Begriff. Nur mit dem bestimmten
Artikel oder einem Demonstrativpronomen gilt es als Eigenname eines Dinges, hört aber damit
auf, als Begriffswort zu gelten. Der Name eines Dinges ist ein Eigenname."

"Wenn wir festhalten, daß in meiner Redeweise Ausdrücke wie "der Begriff F" nicht
Begriffe, sondern Gegenstände bezeichnen, so werden die Einwendungen Kerrys schon
größtenteils hinfällig."

"Auf die Frage, was die Zahl Eins sei, oder was das Zeichen 1 bedeute, wird man meistens
die Antwort erhalten: nun, ein Ding." All our translations from the Grundlagen are taken
from the Austin translation cited in the references.

We are grateful to Steve Awodey for this observation.

again, Frege had to deal with sentences where the syntactic role of a number
is murkier. For example, he considered uses of number terms in language that 
are attributive and do not occur prefixed by the definite article, for example, 
"Jupiter has four moons" [44, §57]. He wrote 

"... our concern here is to arrive at a concept of number usable for 
the purposes of science; we should not, therefore, be deterred by 
the fact that in the language of everyday life number appears also in 
attributive constructions. That can always be got round." [44, §57]

Specifically, it can be got round by writing an attributive statement such as 
"Jupiter has four moons" as "the number of Jupiter's moons is the number 4, 
or 4" [44, §57], thereby eliminating the attributive usage. 

So, for Frege, functions are not objects, but numbers are, because they 
play the subject role in mathematical statements and can be used with the 
definite article. There is clearly a difficulty lurking nearby. At least from a 
modern standpoint, we tend to view functions, sequences, sets, and structures as 
objects, and certainly in Frege's time locutions such as "the function $f$" and "the
series $s$" were common. Frege's response was similar to his response to Kerry's
objection, namely, to deny that expressions like these denote functions. To
understand how this works, consider the fact that Frege's logical system includes
an operator which takes any function $f$ from objects to objects and returns an
object, ἐf(ε), intended to denote its "course-of-values" or "value range." If $f$
is a concept, which is to say, a function which for each object returns a truth value,
the course-of-values of $f$ is called the "extension" of the concept. Frege's Basic
Law V asserts that two functions which are extensionally equal — that is, which 
return equal output values for every input — have the same courses-of-values. 

Frege used these courses-of-values and extensions as object-proxies for func- 
tions and concepts. This is how he analyzed the concept of a cardinal number. 
Let $F$, for example, be a second-level concept, such that $F$ holds of a first-level
concept $f$ if and only if $f$ holds of exactly one object. Frege took the number one
to be the extension of F, thereby achieving the goal of making the number one, 
well, a thing. But this "pushing down" trick is central to the methodology of the 
Grundgesetze: whenever the formal analysis of common mathematical objects 
seems to suggest identifying such objects as functions or concepts, Frege avoided 
doing so by replacing the function or concept with its extension. For example, 
in the Grundgesetze he circumvented the need to define mathematical opera- 
tions on sequences and relations construed as functions, defining the operations 
rather on the associated courses-of-values. Referring to Frege's concepts as
"attributes" and their extensions as "classes," Quine described the difference as
follows:

Frege treated of attributes of classes without looking upon such
discourse as somehow reducible to a more fundamental form treating of
attributes of attributes. Thus, whereas he spoke of attributes of
attributes as second-level attributes, he rated the attributes of classes
as of first level; for he took all classes as rock-bottom objects on par
with individuals. [86, p. 147]

"Da es uns hier darauf ankommt, den Zahlbegriff so zu fassen, wie er für die Wissenschaft
brauchbar ist, so darf es uns nicht stören, dass im Sprachgebrauche des Lebens die Zahl auch
attributiv erscheint. Das lässt sich immer vermeiden."

In fact, the definition of the number one earlier in the paragraph describes, more precisely,
the construction in the Grundlagen. In the Grundgesetze, he took $F$ to be a first-level concept
that holds of classes, i.e. extensions of concepts, that contain one element. This is a nice
illustration of how the "pushing down" trick can be used repeatedly to avoid the use of higher
types. See Reck [90, Section 5] for a discussion of the two definitions, and Burgess for an
overview of Frege's methodology.

Frege never got so far as developing mathematical analysis in his system, and
we cannot say with certainty how he would have developed, for example,
ordinary calculus on the real numbers. But there is a strong hint that here, too,
he would have taken, for example, operations like integration and differentiation
to operate on extensions, rather than functions, in his system. He touched on
the history of analysis in his Function and concept of 1891, and noted that, for
example, differentiation can be understood as a higher-type functional:

Now at this point people had particular second-level functions, but 
lacked the conception of what we have called second-level functions. 
By forming that, we make the next step forwards. One might think 
that this would go on. But probably this last step is not so rich in 
consequences as the earlier ones; for instead of second-level functions 
one can deal, in further advances, with first-level functions — as shall 
be shown elsewhere. [45, p. 31]

Presumably, he had the method of replacing functions by their extensions in 
mind. 

Notice, incidentally, that Frege's method of representing mathematical func- 
tions as courses-of-values has the effect that mathematical functions are treated 
extensionally. For example, defining the integral as an operation that applies 
to a course-of-values means that integration cannot distinguish mathematical 
functions that are extensionally equal, since any two descriptions of a function 
that satisfy extensional equality have the same course-of-values, by Basic Law 
V. 

There are other interesting features of Frege's treatment of functions and 
higher-type objects that push us away from identifying them with the functions 
of ordinary mathematics. For example, for Frege, every function has to be 
defined on the entire domain of individuals; even if one is interested in the 
exponential function on the real numbers, one has to specify a particular (but 
arbitrary) value of this function for every object in existence. And the separation 
of functions and objects has other effects on the system. There is only one basic 

74 "Damit hatte man nun einzelne Funktionen zweiter Stufe, ohne jedoch das zu erfassen, was 
wir Funktion zweiter Stufe genannt haben. Indem man dies tut, macht man den nächsten 
Fortschritt. Man könnte denken, dass dies so weiter ginge. Wahrscheinlich ist aber schon 
dieser letzte Schritt nicht so folgenreich wie die früheren, weil man statt der Funktionen 
zweiter Stufe im weiteren Fortgang Funktionen erster Stufe betrachten kann, wie an einem 
anderen Orte gezeigt werden soll." 
type, so, for example, truth values live alongside everything else. There is no 
notion of identity between higher-type objects — the equality symbol can only be 
applied to equality between objects — even though Frege pointed out that one 
can define the notion of "sameness" of functions and concepts extensionally, 
and noted that this acts very much like identity. Frege's system, of course, 
includes the axiom of universal instantiation. In contemporary notation, this 
would be expressed as $\forall x\, \varphi(x) \to \varphi(a)$, where x is a variable ranging over 
individuals and a is any individual term. It also includes the corresponding 
axiom $\forall F\, \varphi(F) \to \varphi(A)$, where F ranges over functions from objects to objects. 
But, remarkably, the system does not include analogous axioms for higher-type 
objects: the "pushing down trick" obviates the need for these. 

All things considered, Frege's foundational treatment of mathematics seems 
much closer to modern set-theoretic treatments, where there is one homogeneous 
universe of individuals. Truth values are individuals, numbers are individuals, 
mathematical sequences and series are individuals — all bona-fide mathematical 
objects are individuals. Functions play a mere linguistic role in the syntax of the 
system, providing the axioms and rules enough flexibility to get the job done. 
They are not the cogs in the system, but, rather, the oil that keeps the wheels 
turning. As Panza puts it: 

. . . according to Frege, appealing to functions is indispensable in or- 
der to fix the way his formal language is to run, but functions are not 
as such actual components of the language. More generally, func- 
tions manifest themselves in our referring to objects — either concrete 
or abstract — and making statements about them, but they are not 
as such actual inhabitants of some world of concreta and abstracta. 
Briefly: Frege's formal language, as well as our ordinary ones, dis- 
play functions, but there are no functions as such. [83, p. 14] 

10 Foundational concerns 

We have seen that a curious tension lies at the core of Frege's formal repre- 
sentation of mathematics. On the one hand, Frege asserted, repeatedly, that 
functions, in the logical and linguistic sense, are not objects. On the other hand, 
when it comes to formalizing mathematical constructions, he clearly felt that 
functions, in the mathematical sense, have to be objects. His course-of-values 
operator, together with his Basic Law V, allowed him to have his cake and eat 
it too, maintaining clear borders between the two realms while passing between 
them freely. But Frege is often taken to task for failing to realize that this 
strategy opens the door to Russell's paradox. Indeed, the strategy feels like a 
hack, a desperate attempt to satisfy the two central constraints. Why was he 
so committed to them? The goal of this section is to suggest that the concerns 
Frege was trying to address with the design of the logic of the Grundgesetze 
parallel some of the informal mathematical concerns we were able to discern in 
the nineteenth century treatment of characters. 
When one considers the philosophical and logical considerations that went 
into the design of Frege's logic, two possibilities come to mind. One is that Frege 
determined that functions and objects should be separate on broad metaphysical 
grounds, and then designed the logic accordingly. The other is that he designed 
the logic, determined it worked out best with a separation of individuals and 
functions, and read off the metaphysical stance from that. But, in fact, there 
is no clear distinction between these two descriptions. Frege designed his logic 
to try to model scientific practice at its best, and account for and support its 
successes while combating and eliminating confusions. The examples in the 
previous section show that Frege had no qualms about reinterpreting ordinary 
locutions and reconstruing everyday language, so he was by no means a slave to 
naive metaphysical intuitions. But even when doing so he appealed to intuitions 
to convince us that the reconstruals are reasonable. Thus "doing metaphysics" 
meant analyzing the practice, sorting out intuitions, and trying to regiment 
and codify them in a coherent and effective way. From the other direction, 
"getting the logic to work" meant being able to account for the informal practice 
effectively and efficiently, and supporting our intuitions to the extent that they 
can be fashioned into a coherent system. So it is not a question as to whether the 
metaphysics or the logic comes first; working out the metaphysics and designing 
the logic are part and parcel of the same enterprise. The following questions 
therefore seem more appropriate: 

1. What considerations pushed Frege to maintain the sharp distinction be- 
tween function and object? 

2. What considerations pushed Frege to identify mathematical entities, in- 
cluding ordinary mathematical functions, as objects? 

Let us consider each in turn. 

It seems to us that the answer to the first question is simply that Frege felt 
that failure to respect the distinction results in linguistic confusion. 

If it were correct to take "one man" in the same way as "wise man," 
we should be able to use "one" also as a grammatical predicate, and 
to be able to say "Solon was one" just as much as "Solon was wise." 
It is true that "Solon was one" can actually occur, but not in a way 
to make it intelligible on its own in isolation. It may, for example, 
mean "Solon was a wise man," if "wise man" can be supplied from 
the context. In isolation, however, it seems that "one" cannot be a 
predicate. This is even clearer if we take the plural. Whereas we 
can combine "Solon was wise" and "Thales was wise" into "Solon 
and Thales were wise," we cannot say "Solon and Thales were one." 
But it is hard to see why this should be impossible, if "one" were a 
property both of Solon and of Thales in the same way that "wise" 
is.⁷⁵ [g §29] 

75 "Wenn 'Ein Mensch' ähnlich wie 'weiser Mensch' aufzufassen wäre, so sollte man denken, 
dass 'Ein' auch als Praedicat gebraucht werden könnte, sodass man wie 'Solon war weise' 
In other words, even though in some contexts an object word like "one" can 
appear to be used as a predicate, and in other contexts a concept can appear 
to be used as a subject, closer inspection shows that these uses do not conform 
to the rules that govern the use of prototypical subjects and predicates, and so 
should not be categorized in the naive way. 

One of Frege's favorite pastimes was to show that assertions made by philo- 
sophical and mathematical colleagues degenerate into utter nonsense when they 
fail to maintain sufficient linguistic hygiene. For example, in his 1904 essay, 
"What is a Function," Frege was critical of conventional mathematical accounts 
of variables and functions. It is a mistake, he said, to think of a variable as 
being an object that varies: 

... a number does not vary; for we have nothing of which we could 
predicate the variation. A cube never turns into a prime number; 
an irrational number never becomes rational.⁷⁶ [48, p. 658] 

He took the mathematician Emanuel Czuber to task for giving such a sloppy 
account of variables and functions in an introductory mathematical text. For 
example, he criticized Czuber's terminology "a variable assumes a number" [48, 
288] as being incomprehensible. On Czuber's account, a variable is an "indefinite 
number," so the terminology can be rephrased "an indefinite number assumes a 
(definite) number"; but where we may talk about an object assuming a property, 
what can it mean for an object to assume another object? 

In other connections, indeed, we say that an object assumes a prop- 
erty, here the number must play both parts; as an object it is called 
a variable or a variable magnitude, and as a property it is called a 
value. That is why people prefer the word "magnitude" to the word 
"number"; they have to deceive themselves about the fact that the 
variable magnitude and the value it is said to assume are essentially 
the same thing, that in this case we have not got an object assuming 
different properties in succession, and that therefore there can be no 
question of a variation.⁷⁷ [48, p. 660-661] 

auch sagen könnte 'Solon war Ein' oder 'Solon war Einer'. Wenn nun der letzte Ausdruck 
auch vorkommen kann, so ist er doch für sich allein nicht verständlich. Er kann z.B. heissen: 
Solon war ein Weiser, wenn 'Weiser' aus dem Zusammenhange zu ergänzen ist. Aber allein 
scheint 'Ein' nicht Praedicat sein zu können. Noch deutlicher zeigt sich dies beim Plural. 
Während man 'Solon war weise' und 'Thales war weise' zusammenziehen kann in 'Solon und 
Thales waren weise,' kann man nicht sagen 'Solon und Thales waren Ein'. Hiervon wäre die 
Unmöglichkeit nicht einzusehen, wenn 'Ein' sowie 'weise' eine Eigenschaft sowohl des Solon 
als auch des Thales wäre." 

76 "Folglich verändert sich die Zahl gar nicht; denn wir haben nichts, von dem wir die 
Veränderung aussagen könnten. Eine Kubikzahl wird nie zu einer Primzahl, und eine Irra- 
tionalzahl wird nie rational." 

77 "Sonst sagt man wohl, daß ein Gegenstand eine Eigenschaft annehme; hier muß die Zahl 
beide Rollen spielen; als Gegenstand wird sie Variable oder veränderliche Größe, als Eigen- 
schaft wird sie Wert genannt. Darum also zieht man das Wort 'Größe' dem Worte 'Zahl' [...] Wert, 
den sie angeblich annimmt, im Grunde dasselbe sind, daß man gar nicht den Fall hat, wo ein 
Gegenstand nacheinander verschiedene Eigenschaften annimmt, daß also von Veränderung in 
keiner Weise die Rede sein kann." 
The essay closes with the following assessment: 

The endeavor to be brief has introduced many inexact expressions 
into mathematical language, and these have reacted by obscuring 
thought and producing faulty definitions. Mathematics ought prop- 
erly to be a model of logical clarity. In actual fact there are perhaps 
no scientific works where you will find more wrong expressions, and 
consequently wrong thoughts, than in mathematical ones. Logi- 
cal correctness should never be sacrificed to brevity of expression. 
It is therefore highly important to devise a mathematical language 
that combines the most rigorous accuracy with the greatest possible 
brevity. To this end a symbolic language would be best adapted, 
by means of which we could directly express thoughts in written or 
printed symbols without the intervention of spoken language.⁷⁸ [48, 
p. 665] 

Frege aimed to give a clear account of the rules that govern proper logical 
reasoning. Although, in ordinary language, the line between concepts and ob- 
jects is sometimes blurry, failure to diagnose and manage the blurriness opens 
the door to nonsensical reasoning. Even though words like "one" and "horse" 
sometimes seem to denote both concepts and objects, conflating the two causes 
problems. For Frege, the only viable solution was to analyze and regiment such 
uses in a way that cordons off problematic instances. He found that the best 
way to do this is to maintain a clear separation of concept and object, and then 
supplement the analysis with an explanation as to how some words seem to 
cross the divide in certain contexts. 

Now let us turn to the second question: why was Frege so dogged in his 
insistence that mathematical entities like numbers have to be treated as objects, 
and so persistent, in practice, in pushing mathematical constructions down to 
that realm? We believe that the answer lies in an observation that we found in 
Heck [S^: Frege wanted his numbers to be able to count all sorts of entities, 
and the only way he could make that work was by treating all these entities as 
inhabitants of the same type. Consider the following statements: 

• There are two truth values. 

• There are two natural numbers strictly between 5 and 8. 

• There are two constant functions taking values among the truth values. 

78 "Das Streben nach Kürze hat viele ungenaue Ausdrücke in die mathematische Sprache 
eingeführt, und diese haben rückwirkend die Gedanken getrübt und fehlerhafte Definitio- 
nen zuwege gebracht. Die Mathematik sollte eigentlich ein Muster von logischer Klarheit 
sein. In Wirklichkeit wird man vielleicht in den Schriften keiner Wissenschaft mehr schiefe 
Ausdrücke und infolgedessen mehr schiefe Gedanken finden als in den mathematischen. 
Niemals sollte man die logische Richtigkeit der Kürze des Ausdrucks opfern. Deshalb ist es von 
großer Wichtigkeit, eine mathematische Sprache zu schaffen, die mit strengster Genauigkeit 
möglichste Kürze verbindet. Dazu wird wohl am besten eine Begriffsschrift geeignet sein, 
ein Ganzes von Regeln, nach denen man durch geschriebene oder gedruckte Zeichen ohne 
Vermittlung des Lautes unmittelbar Gedanken auszudrücken vermag." 
• There are two characters on (Z/4Z)*. 

• There are two subsets of a singleton set. 

Frege would have insisted that the word "two" in each of these statements refers 
to the same object. We would like to say that the number of truth values is 
equal to the number of natural numbers between 5 and 8, but if truth values 
and numbers were different types of entities, his analysis of number would not 
do that. The problem is that Frege's grammar requires a different notion of 
equinumerosity for each pair of types. In other words, for any two types $\sigma$ and 
$\tau$, there is an expression $\mathrm{Eq}_{\sigma,\tau}$ that expresses equinumerosity between concepts 
of entities of those types. Similarly, for each type $\sigma$, there is a concept $\mathrm{Two}_\sigma$ 
that holds of concepts of arguments of type $\sigma$ under which two elements fall.⁷⁹ 
Taking extensions of each of these, following Frege's construction, would yield 
an object $2_\sigma$ for each type $\sigma$. But this results in a proliferation of twos, and 
since $2_\sigma$ and $2_\tau$ are not guaranteed to be the same object, one would have to 
exercise great care when reasoning about the relationship between them. This 
is clearly unworkable. 

Instead, Frege designed his numbers to count objects, simpliciter: 2 is the 
extension of the concept of being a concept of objects under which exactly two 
objects fall.⁸⁰ But this means that if you want to count a collection of things, 
those things have to be elements of the type of objects. This, in turn, provides 
a strong motivation to locate mathematical entities of all kinds among the type 
of objects. 
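
The uniformity at issue can be illustrated with a small Python sketch. It is only an illustration, not a rendering of Frege's system: the five collections listed above are encoded as finite sets living in a single universe of Python values, and one counting operation (the built-in `len`) applies to all of them. The encodings are assumptions made for the example: functions and characters are represented by their tables of values, and the constant functions are taken to have the truth values as their domain.

```python
from itertools import combinations

# Every collection below is a set of ordinary Python values, so a single
# counting operation (len) applies to each of them.
truth_values = {True, False}
numbers_between_5_and_8 = {n for n in range(0, 20) if 5 < n < 8}

# Assumption: a function is encoded by its table of (argument, value) pairs,
# and the constant functions are taken on the domain of truth values.
constant_functions = {
    frozenset((x, v) for x in truth_values) for v in truth_values
}

# The two characters on (Z/4Z)*, encoded by their values on the units 1 and 3.
characters_mod_4 = {
    frozenset({(1, 1), (3, 1)}),
    frozenset({(1, 1), (3, -1)}),
}

# All subsets of the singleton {"a"}: the empty set and {"a"} itself.
subsets_of_singleton = {
    frozenset(c) for r in range(2) for c in combinations(["a"], r)
}

collections = [truth_values, numbers_between_5_and_8, constant_functions,
               characters_mod_4, subsets_of_singleton]
counts = [len(c) for c in collections]
```

Because every collection is a set of values of the same universe, the five counts can be compared directly with ordinary equality, with no type-indexed family of counting operations.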

We have seen in Sections H] to [8] that what holds true of counting holds true 
of other mathematical operations, relations, and constructions as well. Contem- 
porary proofs of Dirichlet's theorem have us sum over finite sets of characters 
just as we sum over finite sets of numbers. We view the general operation here 
as summation over a finite set of objects, viewing both characters and numbers 
as such. Contemporary proofs also have us consider groups of characters, just 
as we consider groups of residues. Once again, we consider these as instances 
of the general group concept, with the understanding that a group's underlying 
set can be any set of objects. This allows us to speak of a homomorphism be- 
tween any two groups, without requiring a different notion of "homomorphism" 
depending on the type of objects of the groups' carriers. 

Characters were not the only mathematical entities studied in the latter 
half of the nineteenth century that encouraged set-theoretic reification. Gauss' 
genera of quadratic forms, discussed in Section [TTTl also bear a group-theoretic 
structure, and these are sets of quadratic forms. Dedekind developed his theory 
of ideals in order to supplement rings of algebraic integers with "ideal divisors," 
extending the unique factorization property of the ordinary integers to these 
more general domains. Dedekind found that these ideal divisors could be iden- 
tified with sets of elements in the original ring, now known as "ideals." Like the 

79 In Frege's system, for $\sigma$ anything other than the type of objects, this has to be expressed 
in terms of a "sameness" relation for elements of type $\sigma$, since the equality relation only applies 
to objects. 

80 See footnote 73. 
characters, the ideals of a ring of algebraic integers bear an algebraic structure, 
and Dedekind was adamant that they should be treated as bona-fide objects.⁸¹ 
Similarly, Dedekind constructed the real numbers by identifying each of them 
with a pair of sets of rational numbers [21]. By the end of the century, it was 
common to view a quotient group as a group whose elements are equivalence 
classes, or cosets.⁸² 

The reasons given above to treat mathematical functions and sets as objects 
also speak in favor of treating them extensionally. The statements that "there 
are two characters on Z/2Z" and "there are $\varphi(m)$ characters on Z/mZ" are 
false if we take characters to be representations, as there are many different 
representations of the same character. We could, of course, develop notions of 
"counting up to equivalence." In the early days of finite group theory, Camille 
Jordan described quotient groups as systems just like ordinary groups except 
that equality is replaced by an appropriate equivalence relation.⁸³ But, if we do 
that, mathematical statements become "relativized" to the appropriate equiva- 
lence relations, which constitute additional information that needs to be carried 
along and managed. The alternative is to extensionalize: then the only equiva- 
lence relation one has to worry about is equality. 
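
As a rough illustration of the difference between counting representations and counting extensionally, the following Python sketch lists five different descriptions of characters on the unit group (Z/4Z)*, the case mentioned in the list of statements earlier in this section. The names `units`, `extension`, and `is_character` are hypothetical helpers introduced for the example, and a value table stands in, loosely, for a course-of-values.

```python
from math import gcd

def units(m):
    """Representatives of the units modulo m."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]

def extension(f, m):
    """Table of values of f on the units modulo m."""
    return tuple((a, f(a)) for a in units(m))

def is_character(f, m):
    """Check that f is nonzero and multiplicative on the units modulo m."""
    us = units(m)
    return (all(f(a) != 0 for a in us)
            and all(f((a * b) % m) == f(a) * f(b) for a in us for b in us))

# Five descriptions, each of which defines a character on (Z/4Z)*.
representations = [
    lambda n: 1,
    lambda n: (n * n) % 4,
    lambda n: (-1) ** ((n - 1) // 2),
    lambda n: 1 if n % 4 == 1 else -1,
    lambda n: 2 - (n % 4),
]

number_of_descriptions = len(representations)
number_of_extensions = len({extension(f, 4) for f in representations})
number_of_units = len(units(4))
```

Counting the descriptions gives five, whereas counting value tables gives two, which agrees with the number of units modulo 4. On the extensional reading, no equivalence relation other than equality of tables has to be carried along.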

We do not know the extent to which Frege was familiar with examples like 
these. But Wilson [103, 104] calls attention to an important example of ab- 
straction with which Frege was quite well acquainted. Frege was trained as a 
geometer, and studied under Ernst Schering in Göttingen. His dissertation, com- 
pleted in 1873, was titled "Über eine geometrische Darstellung der imaginären 
Gebilde in der Ebene" ("On a geometric representation of imaginary forms in 
the plane"). Early nineteenth century geometers found great explanatory value 
and simplification in extending the usual Euclidean plane with various ideal ob- 
jects, like "points at infinity" and "imaginary" points of intersection. One of the 
few motivating examples that Frege provided in the Grundlagen (§64-§68) is 
the fact that one can identify the "direction" of a line a in the plane with the ex- 
tension of the concept "parallel to a." As Wilson points out (though Frege does 
not), these "directions" are exactly what is needed to serve as points at infinity, 
enabling one to embed the Euclidean plane in the larger projective plane, which 
has a number of pleasing properties. In the projective plane, all points have 
equal standing, and so it stands to reason that the concept-extensions used to 
introduce the new entities should be given the same ontological rights as the Eu- 
clidean points and lines used in their construction. Wilson characterizes such 
strategies for expansion as forms of "relative logicism," since they provide a 
powerful means of relating the newly-minted objects to the more familiar ones. 

When it comes down to the nitty-gritty details, however, the only sustained 
formal development we have from Frege is his treatment of arithmetic. But even 
in this particular case, many of the issues we have raised come to the fore. In 

81 See Avigad [2], especially page 172, and Edwards [31]. 

82 See the detailed discussion in Schlimm [93, Section 3]. Other nice examples of pieces of 
nineteenth century mathematics that push in favor of set-theoretic abstraction are discussed 
in Wilson [103, 104]. 

83 Again, see Schlimm [93, Section 3]. 
the Grundgesetze, Frege defined a number of general operations and relations 
on tuples, sequences, functions, and relations. All of these now can be viewed 
as general set-theoretic constructions. What gives these constructions universal 
validity is that they can be applied to any domain of objects, and we now have 
great latitude in creating objects, as they are needed, to populate these domains. 
It is precisely the ability to bring a wide variety of mathematical constructions 
into the realm of objects, and the ability to define predicates and operations 
uniformly on this realm, that renders Frege's logic so powerful — too powerful, 
alas. But given Frege's goals, it should be clear why the extension operator held 
so much appeal.⁸⁴ 

To sum up, we have traced a central tension in Frege's work to the need to 
balance two competing desiderata: 

1. the need for flexible but rigorous ways of talking about higher-type objects, 
like functions, predicates, and relations, without falling prey to incoher- 
ence; and 

2. the need for ways of dealing with mathematical objects uniformly, since 
mathematical constructions and operations have to be applied to many 
sorts of objects, many of which cannot be foreseen in advance. 

As we have already noted, Frege is often faulted for missing the simple inconsis- 
tency that arises from the formal means he introduced to resolve this tension. 
Nonetheless, it is worth highlighting the extent to which these two concerns were 
central to the subsequent development of logic and foundations. Russell's paradox 
shows that Frege was perfectly right to worry that an overly naive treatment of 
functions, concepts, and objects would lead to problems in the most fundamen- 
tal use of our language and methods of reasoning. And, going into the twentieth 
century, developments in all branches of mathematics called for liberal means 
of constructing new mathematical domains and structures, as well as uniform 
ways of reasoning about their essential structural properties. The most fruitful 
and appropriate means of satisfying these needs was by no means clear at the 
turn of the twentieth century. Indeed, these issues were at the heart of the 
tumultuous foundational debates that were looming over the horizon. 

84 The same uniformity is achieved in set theory by having a large universe of sets, 
and incorporating set-forming operations which return new elements of that universe. Russell 
introduced the notion of typical ambiguity [40, 91] to allow "polymorphic" operations defined 
uniformly across types, and modern interactive theorem provers based on simple type theory 
follow such a strategy to obtain the necessary uniformities. For example, most such systems 
have an operation $\mathrm{card}_\alpha$ which maps a finite set of elements of type $\alpha$ to its cardinality, a natural 
number. The systems include mechanisms that allow one to define this family of operations 
uniformly, once and for all, treating $\alpha$ as a parameter. One can then write card A, and let 
the system infer the relevant type parameter from the type of A. This provides one means 
of coping with the nonuniformities that arise from a type-theoretic compartmentalization of 
the mathematical universe, but the difficulties that accrue to taking simple type theory as a 
mathematical foundation are complex; see, for example, [6]. 
11 Conclusions 



In 1914, Felix Hausdorff published his Grundzüge der Mengenlehre [55]. Dedi- 
cated to Georg Cantor, the book established a modern set-theoretic foundation, 
and used it to support the development of point-set topology. After discussing 
the notion of an ordered pair, Hausdorff gave the concept of function its modern 
definition: 

... we consider a set P of such pairs, satisfying the condition that 
each element of A occurs as the first element of one and only one pair 
p of P. In this way, each element a of A determines a unique element 
b, namely, the one to which it is connected in a pair p = (a, b). We 
denote this element associated to a, which is determined by and 
dependent on a, by 

b = f(a), 

and we say that on A (i.e. for all elements of A) a unique function of 
a is defined. We view two such functions f(a), f'(a) as equal when 
and only when the corresponding sets of pairs P, P' are equal, that 
is, when, for each a, f(a) = f'(a).⁸⁵ [55, p. 33] 
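
Hausdorff's definition translates almost directly into a data structure. The Python sketch below is a minimal illustration under stated assumptions, not a reconstruction of anything in the Grundzüge: a function on A is a frozen set P of pairs, the hypothetical helper `is_function_on` checks the "one and only one pair" condition, `value` recovers b = f(a), and equality of functions is simply equality of the sets of pairs.

```python
def is_function_on(P, A):
    """Hausdorff's condition: each element of A occurs as the first element
    of one and only one pair in P (and no other first elements occur)."""
    firsts = [a for (a, _) in P]
    return set(firsts) == set(A) and len(firsts) == len(set(firsts))

def value(P, a):
    """The unique b with (a, b) in P, written b = f(a) in the text."""
    (b,) = [y for (x, y) in P if x == a]
    return b

A = {0, 1, 2, 3}

# Two sets of pairs obtained from different rules.
f = frozenset((a, (a * a) % 4) for a in A)
g = frozenset((a, a % 2) for a in A)

# A set of pairs in which 0 occurs twice as a first element.
not_a_function = frozenset({(0, 1), (0, 2), (1, 1), (2, 0), (3, 1)})

same_function = (f == g)
```

Here `f` and `g` arise from different rules but are the same set of pairs, so they count as one function, which is exactly the extensional criterion of equality stated at the end of the quotation.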

Work in functional analysis in the 1910's and 1920's helped to cement the mod- 
ern view. In 1859, George Boole published A Treatise on Differential Equa- 
tions [9], in which he introduced the subject as a study of "variable quantities" 
subject to "known" relations between their differential coefficients. That work 
is noteworthy for its observation that differential operators can be viewed as 
algebraic expressions subject to certain laws. Integration was also viewed ab- 
stractly; in the expression $y = \int \varphi(x)\,dx + c$, 

the symbol $\int$ denotes a certain process of integration, the study of 
the various forms and conditions of which is, in a peculiar sense, the 
object of this part of the Integral Calculus. [9, p. 2] 

Modern functional analysis takes the view that operations on functions like dif- 
ferentiation and integration can themselves be viewed as functions, defined over 
spaces of other functions. In 1887, Vito Volterra published a seminal paper, 
"Sopra le funzioni che dipendono da altre funzioni" ("On functions that depend 
on other functions") [53], which adopted such a viewpoint. In 1910, Hadamard 

85 ". . . betrachten wir eine Menge P solcher Paare, und zwar von der Beschaffenheit, daß 
jedes Element a von A in einem und nur einem Paare p von P als erstes Element auftritt. 
Jedes Element a bestimmt auf diese Weise ein und nur ein Element b, nämlich dasjenige, 
mit dem es zu einem Paare p = (a, b) verbunden auftritt; dieses durch a bestimmte, von a 
abhängige, dem a zugeordnete Element bezeichnen wir mit 

b = f(a) 

und sagen, daß hiermit in A (d. h. für alle Elemente von A) eine eindeutige Funktion von a 
definiert sei. Zwei solche Funktionen f(a), f'(a) sehen wir dann und nur dann als gleich an, 
wenn die zugehörigen Paarmengen P, P' gleich sind, wenn also, für jedes a, f(a) = f'(a) 
ist." 

published Leçons sur le calcul des variations, which helped establish the foun- 
dations of functional analysis, and, in fact, introduced the term "functional." 
In 1930, Volterra published an English translation of a series of lectures he had 
given at the University of Madrid in 1925 [99], which began with a lengthy 
discussion of the notion. After presenting particular examples, he said: 

We shall therefore say that a quantity z is a functional of the function 
x(t) in the interval (a, b) when it depends on all the values taken by 
x(t) when t varies in the interval (a, b); or, alternatively, when a law 
is given by which to every function x(t) defined within (a, b) (the in- 
dependent variable within a certain function field) there can be made 
to correspond one and only one quantity z, perfectly determined, and 
we shall write 

$z = F\,\bigl|\bigl[\,x(\overset{b}{\underset{a}{t}})\,\bigr]\bigr|$ 


This definition of a functional recalls especially the ordinary general 
definition of a function given by Dirichlet. [99, p. 4] 

The chapter as a whole describes an outlook that is essentially the contemporary 
mathematical viewpoint, yet in a way that makes clear that the viewpoint was 
one that the mathematical community was still getting used to.⁸⁶ 

Thus, by 1925, the modern view of a function was firmly in place. A function 
could map elements of any domain to any other; one could specify a function 
by specifying any determinate law; functions could serve as arguments to other 
functions; and functions could serve as elements of algebraic structures and 
geometric spaces. The famous 1905 debate among the French analysts as to 
whether it makes sense to consider "arbitrary functions," not given by any rule 
or law, was already two decades in the past.⁸⁷ 

Our study of the treatment of characters in number theory has focused on 
only one small part of the grand historical development that resulted in this 
modern way of thinking. Despite its narrow focus, the case study has illuminated 
some important factors that contributed to the modern view. In particular, we 
have emphasized the difficulties inherent in developing a coherent, rigorous way 
of treating functions as objects, as well as concerns as to how one could do so 
in a way that is appropriate to mathematics and preserves all the information 
that is essential to a rigorous argument. We have also explored some of the 
benefits associated with the methodological changes. Many of these accrue to 
the ability of the new language and notation to highlight central features and 
relationships of the objects in question, while suppressing details that make it 
harder to discern the high-level structure of a mathematical argument. We also 
emphasized the importance of representing mathematical concepts and objects 

86 See also Griffith Evans' helpful introduction to the 1959 Dover reprinting of Volterra's 
lectures [99]. 

87 The "five letters" are translated as an appendix to Moore [80], reproduced in Ewald [39], 
volume 2, pages 1077-1086. 
in order to capture those features that are uniform across different domains of 
argumentation, so that these uniformities can be packaged and used in a modular 
way. We noted that many of these concerns, especially concerns about coherence 
and uniformity, are present in Frege's logical analysis of the function concept, 
despite the fact that Frege's focus was more foundational than mathematical. 

Finally, we have argued that understanding and evaluating the considera- 
tions that shape mathematical language and inferential practice is an important 
part of studying the metaphysics of mathematics. Whether or not one deems 
it appropriate to maintain a sharp distinction between a priori and empirical 
aspects of our scientific practice, the best way to justify our ontological posits is 
by understanding what they do for us, and how they help us attain our scientific 
goals. Moreover, we can discern among these scientific goals some that are pro- 
totypically mathematical. One of these is the goal of maintaining an inferential 
practice with clear rules and norms, one that allows its practitioners to carry 
out, communicate, and evaluate arguments that can become exceedingly long 
and complex. Another is the goal of promoting efficiency of thought, leverag- 
ing whatever features we can to extend our cognitive reach and transcend our 
cognitive limitations. The challenge remains that of developing appropriately 
philosophical means of coming to terms with such "pragmatic" and "cognitive" 
considerations.⁸⁸ To that end, we should resist simplistic accounts: the touch- 
stone must always be mathematical practice itself, with all its richness and 
complexity. 